Distributed RAM (SLICEM Only) - UG474

7 Series FPGAs Configurable Logic Block User Guide (UG474)

Document ID: UG474
Release Date: 2025-04-01
Revision: 1.9 English

The function generators (LUTs) in SLICEMs can be implemented as a synchronous RAM resource called a distributed RAM element. Multiple LUTs in a SLICEM can be combined in various ways to store larger amounts of data. RAM elements within a SLICEM can be configured as:

  • Single-Port 32 x 1-bit RAM
  • Dual-Port 32 x 1-bit RAM
  • Quad-Port 32 x 2-bit RAM
  • Simple Dual-Port 32 x 6-bit RAM
  • Single-Port 64 x 1-bit RAM
  • Dual-Port 64 x 1-bit RAM
  • Quad-Port 64 x 1-bit RAM
  • Simple Dual-Port 64 x 3-bit RAM
  • Single-Port 128 x 1-bit RAM
  • Dual-Port 128 x 1-bit RAM
  • Single-Port 256 x 1-bit RAM

Distributed RAM modules are synchronous (write) resources: writes are clocked, while reads are asynchronous. A synchronous read can be implemented with a flip-flop in the same slice. Using this flip-flop improves distributed RAM performance by reducing the output delay to the clock-to-out value of the flip-flop, at the cost of one additional clock cycle of latency. The distributed RAM elements share the same clock input. For a write operation, the Write Enable (WE) input, driven by either the CE or WE pin of a SLICEM, must be set High.
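The write and read timing described above can be illustrated with a small behavioral model. This is only a sketch in Python, not a hardware description: the class name `SinglePortDistRam`, its methods, and the `registered_read` flag are hypothetical names chosen for illustration, and the model assumes the optional flip-flop captures the LUT output as it was just before the clock edge.

```python
class SinglePortDistRam:
    """Behavioral sketch: synchronous write, asynchronous read,
    optional output flip-flop (all names here are illustrative)."""

    def __init__(self, depth=64, registered_read=False):
        self.mem = [0] * depth
        self.registered_read = registered_read
        self.q = 0  # models the slice flip-flop on the read path

    def read_async(self, addr):
        # Asynchronous read: the output follows the address, no clock needed.
        return self.mem[addr]

    def clock(self, addr, din=0, we=0):
        """Model one rising clock edge and return the output seen afterward."""
        if self.registered_read:
            # Assumption: the flip-flop samples the pre-edge LUT output.
            self.q = self.mem[addr]
        if we:
            # The write takes effect only on the clock edge, and only if WE is High.
            self.mem[addr] = din & 1
        return self.q if self.registered_read else self.mem[addr]
```

With `registered_read=True`, a value written on one edge appears at the output only after the next edge, which is the additional clock latency mentioned above.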

The following table shows the number of LUTs (four per slice) occupied by each distributed RAM configuration. See Vivado Design Suite 7 Series FPGA and Zynq 7000 SoC Libraries Guide (UG953) for details of available distributed RAM primitives.

Table 1. Distributed RAM Configuration

RAM          Description         Primitive    Number of LUTs
32 x 1S      Single port         RAM32X1S     1
32 x 1D      Dual port           RAM32X1D     2
32 x 2Q      Quad port           RAM32M       4
32 x 6SDP    Simple dual port    RAM32M       4
64 x 1S      Single port         RAM64X1S     1
64 x 1D      Dual port           RAM64X1D     2
64 x 1Q      Quad port           RAM64M       4
64 x 3SDP    Simple dual port    RAM64M       4
128 x 1S     Single port         RAM128X1S    2
128 x 1D     Dual port           RAM128X1D    4
256 x 1S     Single port         RAM256X1S    4
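As a rough aid for reading the table, the LUT counts can be turned into an upper bound on how many instances of a primitive fit in one SLICEM. The sketch below is illustrative only: the helper name `primitives_per_slicem` is hypothetical, and the result counts LUTs alone. Actual packing also depends on the instances sharing the same clock, write enable, and address inputs.

```python
LUTS_PER_SLICEM = 4  # four LUTs per slice, as stated above

# LUT counts copied from Table 1.
LUTS_PER_PRIMITIVE = {
    "RAM32X1S": 1, "RAM32X1D": 2, "RAM32M": 4,
    "RAM64X1S": 1, "RAM64X1D": 2, "RAM64M": 4,
    "RAM128X1S": 2, "RAM128X1D": 4, "RAM256X1S": 4,
}

def primitives_per_slicem(primitive):
    """Upper bound on instances per SLICEM, based on LUT count only."""
    return LUTS_PER_SLICEM // LUTS_PER_PRIMITIVE[primitive]
```

For example, `primitives_per_slicem("RAM64X1S")` evaluates to 4 and `primitives_per_slicem("RAM64X1D")` to 2.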

Distributed RAM configurations include:

  • Single port
    • Common address port for synchronous writes and asynchronous reads
      • Read and write addresses share the same address bus
  • Dual port
    • One port for synchronous writes and asynchronous reads
      • One function generator is connected with the shared read and write port address
    • One port for asynchronous reads
      • Second function generator has the A inputs connected to a second read-only port address, and the WA inputs are shared with the first read/write port address
  • Simple dual port
    • One port for synchronous writes (no data out/read port from the write port)
    • One port for asynchronous reads
  • Quad port
    • One port for synchronous writes and asynchronous reads
    • Three ports for asynchronous reads
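The dual-port and quad-port arrangements listed above can be sketched as one LUT copy per port, with every copy written at the shared write address. This is a behavioral illustration under that assumption. The class `ReplicatedPortRam` and its method names are hypothetical, and the simple dual-port case (which stores different data bits in each LUT) is not modeled.

```python
class ReplicatedPortRam:
    """Sketch of a 1-bit-wide multi-port distributed RAM:
    copy 0 serves the read/write port, copies 1..N the read-only ports."""

    def __init__(self, depth, read_only_ports):
        self.copies = [[0] * depth for _ in range(1 + read_only_ports)]

    def clock(self, waddr, din, we):
        """Rising clock edge: a write updates every copy at the same address."""
        if we:
            for lut in self.copies:
                lut[waddr] = din & 1

    def read_rw_port(self, waddr):
        # Asynchronous read on the port that shares the write address.
        return self.copies[0][waddr]

    def read_port(self, port, raddr):
        # Asynchronous read on read-only port 1..N with its own address.
        if not 1 <= port < len(self.copies):
            raise IndexError("no such read-only port")
        return self.copies[port][raddr]
```

In this sketch, `ReplicatedPortRam(64, 1)` uses two copies and `ReplicatedPortRam(64, 3)` uses four, which lines up with the two and four LUTs occupied by the 64 x 1 dual-port and quad-port configurations.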

As shown in Figure 1, the common write port W6:W1 (WA[6:1] in the following figures) is always physically driven by the inputs to the D LUT using D[6:1]. The read ports are independent for each of the four LUTs. Therefore, the D LUT is always effectively single port, even if DPRAM64 is selected as the LUT configuration. The other three LUTs are always effectively dual port, although SPRAM32 can be selected when the read and write addresses are connected together.

Figure 1 through Figure 9 illustrate various example distributed RAM configurations occupying one SLICEM.

When using x2 configurations (as in 32 X 2 Quad Port in Figure 1), A6 and WA6 are driven High by the software to keep O5 and O6 independent.
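Why tying A6 High keeps the two outputs independent can be seen in a small read-side sketch. It assumes the usual dual-output LUT mapping (as in the LUT6_2 primitive), where O6 indexes all 64 bits with A6:A1 and O5 indexes only the lower 32 bits with A5:A1. The function name `lut6_outputs` and the bit-list representation are illustrative, and the write side is not modeled.

```python
def lut6_outputs(bits, addr):
    """Return (o5, o6) for a 64-entry LUT.

    bits: list of 64 stored bits; addr: 6-bit address with A6 as the MSB.
    Assumed mapping: O6 reads all 64 bits, O5 reads only the lower 32.
    """
    o6 = bits[addr & 0x3F]
    o5 = bits[addr & 0x1F]
    return o5, o6

def read_32x2(bits, addr5):
    """Read both bits of a 32 x 2 word with A6 forced High."""
    return lut6_outputs(bits, 0x20 | (addr5 & 0x1F))
```

With A6 High, O6 always reads from the upper 32 bits while O5 reads from the lower 32, so the two outputs never address the same storage cell. With A6 Low, both outputs would return the same bit.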

Figure 1. 32 X 2 Quad Port Distributed RAM (RAM32M)
Figure 2. 32 X 6 Simple Dual Port Distributed RAM (RAM32M)
Figure 3. 64 X 1 Single Port Distributed RAM (RAM64X1S)

If four single-port 64 x 1-bit modules are each built as shown in Figure 3, the four RAM64X1S primitives can occupy a SLICEM, as long as they share the same clock, write enable, and shared read and write port address inputs. This configuration equates to a 64 x 4-bit single-port distributed RAM.

If two dual-port 64 x 1-bit modules are each built as shown in Figure 4, the two RAM64X1D primitives can occupy a SLICEM, as long as they share the same clock, write enable, and shared read and write port address inputs. This configuration equates to a 64 x 2-bit dual-port distributed RAM.

Figure 4. 64 X 1 Dual Port Distributed RAM (RAM64X1D)
Figure 5. 64 X 1 Quad Port Distributed RAM (RAM64M)
Figure 6. 64 X 3 Simple Dual Port Distributed RAM (RAM64M)

Implementing distributed RAM configurations with a depth greater than 64 requires the wide-function multiplexers (F7AMUX, F7BMUX, and F8MUX), as shown in Figure 7 through Figure 9.
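The read path of these deeper configurations can be sketched as 64-entry LUTs selected by the upper address bits. The model below is illustrative only: the function names are hypothetical, and the assignment of address ranges to particular LUTs is an assumption rather than something stated in this section. Write-enable decoding is not modeled.

```python
def mux2(sel, in0, in1):
    """Two-input multiplexer, standing in for F7AMUX, F7BMUX, or F8MUX."""
    return in1 if sel else in0

def read_128x1(lut_lo, lut_hi, addr):
    """128 x 1 read: address bit 6 selects between two 64-entry LUTs."""
    low = addr & 0x3F
    return mux2((addr >> 6) & 1, lut_lo[low], lut_hi[low])

def read_256x1(luts, addr):
    """256 x 1 read: two F7-level muxes feed one F8-level mux.

    luts: four 64-entry lists; the ordering is an assumption of this sketch.
    """
    low = addr & 0x3F
    f7a = mux2((addr >> 6) & 1, luts[0][low], luts[1][low])
    f7b = mux2((addr >> 6) & 1, luts[2][low], luts[3][low])
    return mux2((addr >> 7) & 1, f7a, f7b)
```

Splitting a flat 256-entry list into four 64-entry slices and reading every address through `read_256x1` returns the original contents, which is a quick way to check the mux ordering used in the sketch.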

If two single-port 128 x 1-bit modules are each built as shown in Figure 7, the two RAM128X1S primitives can occupy a SLICEM, as long as they share the same clock, write enable, and shared read and write port address inputs. This configuration equates to 128 x 2-bit single-port distributed RAM.

Figure 7. 128 X 1 Single Port Distributed RAM (RAM128X1S)
Figure 8. 128 X 1 Dual Port Distributed RAM (RAM128X1D)
Figure 9. 256 X 1 Single Port Distributed RAM (RAM256X1S)

Distributed RAM configurations greater than the provided examples require more than one SLICEM. There are no direct connections between slices to form larger distributed RAM configurations within a CLB or between slices.