By default, the Vivado synthesis engine uses a balanced approach to decompose a user-specified RTL RAM block into several BRAM/URAM primitives. It uses the built-in cascade chains as well as external multiplexers, depending on the configuration and user requirements. This approach works well in most scenarios.
When the user RTL RAM size (depth x width) does not divide evenly into the primitive sizes, an alternative algorithm can improve the utilization of RAM components. Setting the RAM_DECOMP attribute to "area" on the user-defined RTL RAM instructs the synthesis tool to optimize for area across the different primitive sizes.
The following example sets the RAM_DECOMP attribute to "area" and shows the area recovered compared with the default memory decomposition algorithm.
With the default algorithm, 96 URAM288E5 primitives are inferred, and the memory is decomposed as follows:
- The base primitive is 32Kx8.
- 96 parallel structures create a 768-bit-wide memory (96 x 8 = 768).
With the area-optimal algorithm, 77 URAM288E5 primitives are inferred, and the memory is decomposed as follows:
- The base primitive is 4Kx72.
- 7 UltraRAMs are cascaded to create a 28Kx72 configuration.
- 11 parallel structures provide the 768-bit-wide memory (11 x 72 = 792 bits, of which 768 are used), for a total of 7 x 11 = 77 primitives.
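
The primitive counts in the two decompositions can be checked with simple tiling arithmetic. The Python sketch below is illustrative only. It assumes a RAM depth of 28K, which is inferred from the 28Kx72 cascade (the text states only the 768-bit width). The helper `primitives_needed` is hypothetical and is not part of Vivado. It models plain tiling with a single primitive shape, not the actual decomposition algorithm used by the synthesis tool.

```python
from math import ceil


def primitives_needed(depth, width, prim_depth, prim_width):
    """Tile a depth x width RAM with one primitive shape.

    Returns (cascaded, parallel, total), where `cascaded` is the number of
    primitives chained to cover the depth and `parallel` is the number of
    side-by-side structures needed to cover the width.
    """
    cascaded = ceil(depth / prim_depth)
    parallel = ceil(width / prim_width)
    return cascaded, parallel, cascaded * parallel


# Assumption: 28K-deep RAM (inferred from the 28Kx72 cascade in the text).
DEPTH = 28 * 1024
WIDTH = 768  # bus width stated in the text

# Default decomposition: base primitive 32Kx8.
default = primitives_needed(DEPTH, WIDTH, 32 * 1024, 8)

# Area-optimal decomposition: base primitive 4Kx72.
area = primitives_needed(DEPTH, WIDTH, 4 * 1024, 72)

print("default:", default)  # (1, 96, 96)
print("area:   ", area)     # (7, 11, 77)
```

Under these assumptions, the wide 4Kx72 shape wastes only 24 of 792 data bits per row of primitives, which is why it needs 19 fewer URAM288E5 primitives than the 32Kx8 shape.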