The following test case shows how restructuring logic can improve a critical path, for example by moving a macro (block RAM) closer to the destination register.
The following figure shows a 16x1 Multiplexer with only one input to the Multiplexer coming from block RAM and the rest of the inputs being fed by registers.
Critical path: block RAM -> 2 logic levels -> FF.
The following figure shows the critical path, with the block RAM to FF path highlighted in red. There are 2 logic levels on the block RAM->FF path as well as on the FF->FF paths. Because the CLK->Q delay of a block RAM is higher than that of a FF, the block RAM->FF path is the critical one.
Next, look at the RTL code snippet shown in the following figure to see whether there is a way to restructure the logic.
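As a rough illustration of the kind of RTL that produces this structure, the following minimal sketch describes a flat 16x1 multiplexer in which select value 4'd5 picks the block RAM output and every other select value picks a register-fed input. The module name, port names, data width, and the packed `reg_in` bus are assumptions made for this sketch and are not taken from the figure.

```verilog
// Hypothetical sketch: all names and the width W are illustrative assumptions.
module mux16_flat #(
    parameter W = 8                    // assumed data width
) (
    input  wire            clk,
    input  wire [3:0]      sel,        // 16x1 multiplexer select
    input  wire [W-1:0]    bram_dout,  // data coming from the block RAM output
    input  wire [16*W-1:0] reg_in,     // register-fed inputs, one W-bit slot per
                                       // select value (slot 5 is unused)
    output reg  [W-1:0]    dout        // destination register (FF)
);
    // One flat multiplexer: the block RAM output sits at the same depth
    // as every register-fed input.
    always @(posedge clk) begin
        case (sel)
            4'd5:    dout <= bram_dout;
            default: dout <= reg_in[sel*W +: W];
        endcase
    end
endmodule
```

Written this way, the block RAM output is just one of sixteen equal inputs, so it passes through the full multiplexer depth before reaching the destination register.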
The optimal way to restructure the logic is to rewrite the above code snippet so that the 16x1 Multiplexer is broken into two multiplexers. You can pull the select value 4'd5 out of the main multiplexer and use that condition as the select of a 2x1 Multiplexer, as shown in the following figure. This cascaded Multiplexer structure results in FF->FF paths with 3 logic levels, but the block RAM->FF path is reduced to 1 logic level. Improving the block RAM->FF path in this way helps the downstream tools achieve better placement, because RAMB placement is more challenging than LUT and FF placement. In general, fewer long paths around macro primitives such as RAMB, URAM, and DSP yield better QoR for any given design.
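A minimal sketch of the cascaded structure, under assumed names, might look like the following. The module name, port names, data width, and the packed `reg_in` bus are hypothetical and are not taken from the figure; the sketch only shows the idea of choosing among the register-fed inputs first and merging the block RAM output in a final 2x1 stage.

```verilog
// Hypothetical sketch: all names and the width W are illustrative assumptions.
module mux16_cascade #(
    parameter W = 8                    // assumed data width
) (
    input  wire            clk,
    input  wire [3:0]      sel,        // original 16x1 multiplexer select
    input  wire [W-1:0]    bram_dout,  // data coming from the block RAM output
    input  wire [16*W-1:0] reg_in,     // register-fed inputs, one W-bit slot per
                                       // select value (slot 5 is unused)
    output reg  [W-1:0]    dout        // destination register (FF)
);
    // Stage 1: multiplexer over the register-fed inputs only.
    // Its result is a don't-care when sel == 4'd5.
    wire [W-1:0] reg_mux = reg_in[sel*W +: W];

    // Stage 2: 2x1 multiplexer enabled by the 4'd5 condition, so the
    // block RAM output enters at the last stage before the FF.
    wire use_bram = (sel == 4'd5);

    always @(posedge clk) begin
        dout <= use_bram ? bram_dout : reg_mux;
    end
endmodule
```

The sketch is functionally the same as a flat 16x1 multiplexer; only the described structure differs. Whether synthesis keeps the two stages separate is worth confirming in the post-synthesis timing report by checking the logic levels on the block RAM->FF path.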