Optimizing RAMB Input Logic to Allow Output Register Inference - 2025.2 English - UG906

Vivado Design Suite User Guide: Design Analysis and Closure Techniques (UG906)

Document ID: UG906
Release Date: 2025-12-10
Version: 2025.2 English

The following RTL code snippet generates a critical path that starts at a block RAM (in this case, a ROM), passes through multiple logic levels, and ends at a flip-flop (FF). The RAMB cell is inferred without its optional output registers (DOA_REG = 0), which adds more than 1 ns of delay to the RAMB output path.

Figure 1. Memory RTL Code Without Inferred RAMB Output Register

The critical path for this RTL code is shown in the next figure.

Figure 2. Critical Path from RAMB Without Output Register Enabled

It is good practice to review critical paths after synthesis and after each implementation step to identify which groups of logic require improvement. For long paths, or paths that do not take optimal advantage of FPGA hardware features, examine the RTL description to understand why the synthesized logic is not optimal and modify the code to help the synthesis tool improve the netlist.

Vivado provides an embedded debugging mechanism through the elaborated view, which helps identify where inefficiencies occur without manually searching through the RTL code.

Figure 3. Elaborated View of RTL Code Snippet

In this case, the problem arises from the address register fanout (addr_reg3_reg), which drives both the memory address and some interconnect logic (highlighted in blue). RAMB inference by the synthesis tool requires a dedicated address register in the RTL code, which is incompatible with the current address register fanout. As a result, the synthesis tool re-times the output register to allow RAMB inference instead of using it to enable the RAMB optional output register.
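The shared-address pattern described above can be sketched in Verilog as follows. This is a minimal illustration, not the code from the figure: the module name, port names, widths, ROM contents, and the `addr_flag` logic are all hypothetical, and only `addr_reg2` and `addr_reg3` follow the register names used in the text.

```verilog
// Hypothetical sketch: addr_reg3 drives BOTH the ROM address and other logic.
module rom_shared_addr #(
    parameter AW = 10,                 // address width (assumed)
    parameter DW = 16                  // data width (assumed)
) (
    input  wire          clk,
    input  wire [AW-1:0] addr_in,
    output reg  [DW-1:0] rom_q,        // registered ROM read data
    output reg           addr_flag     // stands in for the "other" logic
);
    // Request block RAM mapping for the ROM array.
    (* rom_style = "block" *) reg [DW-1:0] rom [0:(1<<AW)-1];

    // Placeholder contents; a real design would typically use $readmemh.
    integer i;
    initial begin
        for (i = 0; i < (1<<AW); i = i + 1)
            rom[i] = i;                // truncated to DW bits
    end

    reg [AW-1:0] addr_reg2;
    reg [AW-1:0] addr_reg3;

    always @(posedge clk) begin
        addr_reg2 <= addr_in;
        addr_reg3 <= addr_reg2;
        rom_q     <= rom[addr_reg3];   // ROM read through addr_reg3
        addr_flag <= &addr_reg3;       // extra fanout on addr_reg3
    end
endmodule
```

Because `addr_reg3` has a second load, it cannot serve as a register dedicated to the memory address, which is the condition the paragraph above identifies as blocking use of the optional output register.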

By replicating the address register in the RTL code so that the memory address and interconnect logic are driven by separate registers, the RAMB can be inferred with the output registers enabled.

Figure 4. RTL Code with the Replicated Address Register

Figure 5. Elaborated View of the Replicated Address Register
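One way to write the replicated address register is sketched below in Verilog. As before, this is an illustration under stated assumptions rather than the code from the figure: the module, ports, widths, ROM contents, and the replica name `addr_reg3_dup` are hypothetical. The `keep` attribute on the replica is also an assumption, added to discourage synthesis from merging the two equivalent registers back into one.

```verilog
// Hypothetical sketch: addr_reg3 feeds only the ROM; a replica feeds other logic.
module rom_replicated_addr #(
    parameter AW = 10,                 // address width (assumed)
    parameter DW = 16                  // data width (assumed)
) (
    input  wire          clk,
    input  wire [AW-1:0] addr_in,
    output reg  [DW-1:0] rom_q,        // registered ROM read data
    output reg           addr_flag     // stands in for the "other" logic
);
    // Request block RAM mapping for the ROM array.
    (* rom_style = "block" *) reg [DW-1:0] rom [0:(1<<AW)-1];

    // Placeholder contents; a real design would typically use $readmemh.
    integer i;
    initial begin
        for (i = 0; i < (1<<AW); i = i + 1)
            rom[i] = i;                // truncated to DW bits
    end

    reg [AW-1:0] addr_reg2;
    reg [AW-1:0] addr_reg3;            // dedicated to the ROM address

    // Replica of addr_reg3 for non-memory logic (keep is an assumption).
    (* keep = "true" *) reg [AW-1:0] addr_reg3_dup;

    always @(posedge clk) begin
        addr_reg2     <= addr_in;
        addr_reg3     <= addr_reg2;        // single load: the ROM address
        addr_reg3_dup <= addr_reg2;        // same value, separate register
        rom_q         <= rom[addr_reg3];   // ROM read through dedicated register
        addr_flag     <= &addr_reg3_dup;   // other logic uses the replica
    end
endmodule
```

Both registers load the same value from `addr_reg2`, so the behavior is unchanged; the only difference is that the memory address now has a register with no other loads. Whether the output register is actually enabled should be confirmed in the synthesized netlist.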

The critical path for the modified RTL code shows the improvement:

  • The addr_reg2_reg register is connected to the address pin of the block RAM.
  • The addr_reg3_reg register has been absorbed into the block RAM.
  • The RAMB output register is enabled, significantly reducing the datapath delay on the RAMB outputs.

Figure 6. Critical Path for the Modified RTL Code