Decomposing Deeper Memory Configurations for Balanced Power and Performance

UltraFast Design Methodology Guide for Xilinx FPGAs and SoCs

Document ID: UG949
Release Date: 2021-11-19
Version: 2021.2 English

When working with deeper memory configurations, you can apply the RAM_DECOMP synthesis attribute in the RTL to reduce power by improving how the memory is composed from block RAM primitives. When the RAM_DECOMP attribute is applied to a memory array, the memory is mapped to block RAM primitives in a wider, shallower configuration, so the depth is built from several primitives. To balance power and performance, you can control cascading by using the CASCADE_HEIGHT attribute together with the RAM_DECOMP attribute. This approach requires more address decoding logic, but it reduces the number of block RAMs that are enabled for each read operation, which reduces power.
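To make the power argument concrete, the following Python sketch counts how many block RAMs a read enables under two tilings of the same 32-bit x 16K memory. It is a simplified model, not Vivado's mapping algorithm: the function name `block_ram_usage` is hypothetical, parity bits are ignored, and the 2-bit x 16K tiling is only an assumed example of a width-sliced arrangement. The 32-bit x 1K tiling corresponds to the base primitive described in this section.

```python
import math

def block_ram_usage(width_bits, depth_words, prim_width, prim_depth):
    """Return (total_block_rams, enabled_per_read) for a memory tiled
    from identical primitives of prim_width x prim_depth."""
    # Primitives placed side by side each supply part of every word,
    # so all of them are enabled on every read.
    columns = math.ceil(width_bits / prim_width)
    # Primitives stacked in depth each hold a different address range,
    # so only one row is enabled for a given address.
    rows = math.ceil(depth_words / prim_depth)
    return columns * rows, columns

# Assumed width-sliced tiling: 16 primitives of 2 bits x 16K words.
print(block_ram_usage(32, 16 * 1024, prim_width=2, prim_depth=16 * 1024))  # (16, 16)
# Depth-sliced tiling with a 32-bit x 1K base primitive.
print(block_ram_usage(32, 16 * 1024, prim_width=32, prim_depth=1024))      # (16, 1)
```

Both tilings use 16 block RAMs, but in this model the depth-sliced tiling enables one block RAM per read instead of 16, at the cost of decoding the upper address bits to select it.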

For example, the following figure shows a 32x16K memory configuration.

Figure 1. 32x16K Memory Configuration
If you apply the following attributes:
ram_decomp = "power"
cascade_height = 4

16 RAMB36E2 primitives are inferred and the memory is decomposed as follows:

  • The base primitive is 32x1K.
  • 4 block RAMs are cascaded to create a 32x4K configuration.
  • 4 parallel structures create a 16K deep memory.
  • The outputs of the 4 parallel structures are multiplexed through a 4:1 MUX to generate the output data.

Figure 2. Generated Structure for 32x16K Memory Configuration Example Using CASCADE_HEIGHT and RAM_DECOMP Attributes
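The arithmetic behind the decomposition listed above can be sketched in Python. This is an illustrative model under stated assumptions, not a description of the synthesis tool's internals: the names `Decomposition` and `decompose` are hypothetical, and the base primitive depth of 1K words (at 32 bits wide) is taken from the list above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decomposition:
    primitives: int   # total block RAMs used
    chain_depth: int  # words covered by one cascade chain
    chains: int       # number of parallel cascade chains
    mux_inputs: int   # inputs of the output multiplexer

def decompose(depth_words, *, cascade_height, base_depth=1024):
    """Split a memory of depth_words into cascade chains of base primitives."""
    if depth_words % base_depth:
        raise ValueError("depth must be a multiple of the base primitive depth")
    primitives = depth_words // base_depth
    if primitives % cascade_height:
        raise ValueError("primitive count must be a multiple of cascade_height")
    chains = primitives // cascade_height
    # One multiplexer input per chain selects the chain holding the address.
    return Decomposition(primitives, base_depth * cascade_height, chains, chains)

d = decompose(16 * 1024, cascade_height=4)
print(d.primitives, d.chain_depth, d.chains, d.mux_inputs)  # 16 4096 4 4
```

With `cascade_height=4`, the model reproduces the structure described above: 16 primitives, chains covering 4K words each, 4 parallel chains, and a 4-input output multiplexer.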

The following RTL code example shows the use of the CASCADE_HEIGHT and RAM_DECOMP attributes.

Figure 3. RTL Code for 32x16K Memory Configuration Using the CASCADE_HEIGHT and RAM_DECOMP Attributes

If you apply only the ram_decomp = "power" attribute, 16 RAMB36E2 primitives are still inferred, but the memory is decomposed as follows:

  • The base primitive is 32x1K.
  • 8 block RAMs are cascaded to create a 32x8K configuration.
  • 2 parallel structures create a 16K deep memory.
  • The outputs of the 2 parallel structures are multiplexed through a 2:1 MUX to generate the output data.

Figure 4. Generated Structure for 32x16K Memory Configuration Using the RAM_DECOMP Attribute

The following RTL code example shows the use of the RAM_DECOMP attribute.

Figure 5. RTL Code for 32x16K Memory Configuration Using the RAM_DECOMP Attribute

If you use only the RAM_DECOMP attribute, the overall power savings are similar to those obtained with both the RAM_DECOMP and CASCADE_HEIGHT attributes, because in either case only one block RAM is active at a time. However, a 4-deep cascaded block RAM chain gives better performance than an 8-deep cascaded block RAM chain.
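The following Python sketch illustrates why both variants enable a single block RAM per access while differing in chain length. It assumes a simple linear address split, in which the low 10 bits address a word inside a 1K-deep primitive and the remaining bits select the primitive. The actual address mapping chosen by the tool is not stated here, and the function name `locate` is hypothetical.

```python
def locate(address, cascade_height, base_depth=1024):
    """Map a word address to (chain, position_in_chain, local_address)
    under an assumed linear address split."""
    primitive = address // base_depth        # which block RAM holds the word
    chain = primitive // cascade_height      # which parallel chain (MUX select)
    position = primitive % cascade_height    # position inside the cascade chain
    return chain, position, address % base_depth

# The same address resolves to exactly one block RAM in both variants.
print(locate(5000, cascade_height=4))  # (1, 0, 904)
print(locate(5000, cascade_height=8))  # (0, 4, 904)
```

In both cases one primitive is selected, so the number of enabled block RAMs per read is the same. What changes is the longest cascade path, 4 primitives versus 8, which is where the performance difference comes from.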

For more information, see the RAM_DECOMP and CASCADE_HEIGHT attribute descriptions in the Vivado Design Suite User Guide: Synthesis (UG901).