For deep memories, the RAM_DECOMP synthesis attribute can be applied in the
RTL to improve memory decomposition and reduce power consumption. When it is
applied, the memory is built from wider, shallower primitives instead of deep
and narrow ones.
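As a minimal sketch of how the attribute might be attached in Verilog, the example below places ram_decomp on the memory array of a simple single-port RAM. The module name, port names, and the 32-bit × 16K dimensions are illustrative assumptions, not taken from a specific design.

```verilog
// Hypothetical single-port RAM; module name, ports, and sizes are assumptions.
module deep_ram #(
    parameter DATA_WIDTH = 32,
    parameter ADDR_WIDTH = 14          // 2**14 = 16K words
) (
    input  wire                  clk,
    input  wire                  en,
    input  wire                  we,
    input  wire [ADDR_WIDTH-1:0] addr,
    input  wire [DATA_WIDTH-1:0] din,
    output reg  [DATA_WIDTH-1:0] dout
);

    // Request the power-oriented decomposition for this array.
    (* ram_decomp = "power" *)
    reg [DATA_WIDTH-1:0] mem [0:(2**ADDR_WIDTH)-1];

    always @(posedge clk) begin
        if (en) begin
            if (we)
                mem[addr] <= din;
            dout <= mem[addr];         // read-first: returns the old word on a write
        end
    end

endmodule
```

The attribute sits directly on the array declaration, so it affects only this memory. How the array is actually mapped still depends on the target device and tool version.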
When the CASCADE_HEIGHT attribute is used together with RAM_DECOMP, synthesis
has finer control over how the block RAMs are cascaded, which makes it
possible to balance power against performance. This approach requires
additional address decoding logic, but it reduces the number of block RAMs
accessed at a time, which helps lower power consumption.
For example, applying RAM_DECOMP = power and CASCADE_HEIGHT = 4 to a
32 × 16K memory infers 16 RAMB36E2 blocks and decomposes the memory as follows.
The base primitive in this configuration is 32 × 1K. Four block RAMs are cascaded to form a 32 × 4K structure, and four such structures in parallel create the 16K-deep memory, with their outputs multiplexed through a 4:1 MUX to generate the output data.
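The combination described above could be expressed in Verilog roughly as follows. This is a sketch under stated assumptions: the module and port names are hypothetical, and the simple dual-port style is chosen only for illustration. The 32 × 16K dimensions follow the example in the text.

```verilog
// Hypothetical 32-bit x 16K simple dual-port RAM; names are assumptions.
module deep_ram_cascade4 (
    input  wire        clk,
    input  wire        we,
    input  wire [13:0] waddr,
    input  wire [13:0] raddr,
    input  wire [31:0] din,
    output reg  [31:0] dout
);

    // Both attributes on the same array: power-oriented decomposition,
    // with cascade chains limited to four block RAMs.
    (* ram_decomp = "power", cascade_height = 4 *)
    reg [31:0] mem [0:16383];

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= din;
        dout <= mem[raddr];            // registered (synchronous) read
    end

endmodule
```

Removing cascade_height from the attribute list leaves the cascade depth for synthesis to choose. The resulting block count and structure are best confirmed in the synthesis report for the actual device and tool version.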
If only RAM_DECOMP = power is applied, 16 RAMB36E2 blocks are still inferred,
but the decomposition changes as shown in the following figure.
Figure: RAM_DECOMP Attribute
In this case, the base primitive is 32 × 1K, with eight block RAMs cascaded to form a 32 × 8K configuration. Two such parallel structures create a 16K-deep memory, with outputs multiplexed through a 2:1 MUX.
Figure: RAM_DECOMP Attribute
Power savings are similar in both configurations (Figure 2 and Figure 4) because only one block RAM is active at a time. Performance differs, however: the four-deep block RAM cascade chain (Figure 2) provides better performance than the eight-deep chain (Figure 4).
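The point that only one block RAM is active per access can be pictured with a small behavioral model. The sketch below is illustrative only: the module name, port names, and the choice of address bits are assumptions, and it does not represent the decode logic that synthesis actually generates.

```verilog
// Illustrative model only: one-hot block selection for a 32 x 16K memory
// built from sixteen 32 x 1K primitives. The bit assignment is an assumption.
module block_select_sketch #(
    parameter CASCADE_HEIGHT = 4               // 4 or 8 in the two cases above
) (
    input  wire [13:0] addr,
    output wire [15:0] block_en,               // one enable per 1K-deep primitive
    output wire [3:0]  chain_sel               // parallel chain that drives the output MUX
);

    // addr[9:0] addresses a word inside a primitive;
    // the upper four bits pick one of the 16 primitives.
    wire [3:0] block_idx = addr[13:10];

    assign block_en  = 16'b1 << block_idx;            // exactly one enable set
    assign chain_sel = block_idx / CASCADE_HEIGHT;    // 0..3 for height 4, 0..1 for height 8

endmodule
```

In this model the number of enabled primitives is the same for either cascade height. Only the chain length and the width of the final multiplexer change, which is consistent with similar power but different timing.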