ECC Scrubbing - 1.1 English - PG313

Versal Adaptive SoC Programmable Network on Chip and Integrated Memory Controller 1.1 LogiCORE IP Product Guide (PG313)

Document ID: PG313
Release Date: 2025-05-29
Version: 1.1 English

On-the-fly scrubbing occurs when a correctable ECC error is detected on a read transaction. The controller then executes a Read-Modify-Write (RMW) operation at the same memory address. An RMW is used rather than a plain write so that, if a write to that address occurred after the correctable error was detected but before the controller returned to complete the scrub, the newer data is not overwritten. If only an uncorrectable error is detected, on-the-fly scrubbing is not performed. However, if both correctable and uncorrectable errors are detected in a single burst, scrubbing is performed.
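The decision rule above can be restated as a small truth table. The following Python sketch only illustrates the documented behavior and is not controller logic; the function name `scrub_on_the_fly` and its boolean arguments are hypothetical.

```python
def scrub_on_the_fly(correctable: bool, uncorrectable: bool) -> bool:
    """Return True if a read burst triggers an on-the-fly scrub RMW.

    Hypothetical helper that models only the rule stated in the text.
    """
    if correctable:
        # A correctable error triggers the scrub, even when the same
        # burst also contains an uncorrectable error.
        return True
    # An uncorrectable error alone, or no error at all, triggers nothing.
    return False


# Enumerate all four combinations of error flags.
for c in (False, True):
    for u in (False, True):
        print(f"correctable={c!s:5} uncorrectable={u!s:5} -> scrub={scrub_on_the_fly(c, u)}")
```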

Background scrubbing is the process of stepping through the DRAM and performing an RMW to each address to mitigate data loss from single event upsets. The memory controller uses idle cycles to perform the scrubbing and, under full traffic, periodically inserts scrub transactions to ensure that progress is made. For DDR4, you can set the background scrubbing period in the GUI. The default value scrubs the DDR4 memory space once every 24 hours. For LPDDR4/4X, the background scrubbing period is not a GUI option and is set to a fixed value of 20 μs.
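To get a feel for how little bandwidth the default DDR4 period needs, the sketch below estimates the average scrub rate. It assumes, for illustration only, that each scrub RMW covers one full BL8 access on a x64 data bus (64 bytes) and uses a hypothetical 16 GiB memory. Neither the capacity nor the per-RMW granularity comes from this guide, and the function name is hypothetical.

```python
def scrub_rmw_per_second(capacity_bytes: int, bytes_per_rmw: int, period_s: float) -> float:
    """Average number of scrub RMW operations per second needed to
    cover the whole memory once per scrub period."""
    return capacity_bytes / bytes_per_rmw / period_s


capacity = 16 * 2**30     # hypothetical 16 GiB DDR4 memory
bytes_per_rmw = 8 * 8     # assumed: 8 data bytes per BLn position x BL8
period = 24 * 3600        # default DDR4 period from the text: 24 hours

rate = scrub_rmw_per_second(capacity, bytes_per_rmw, period)
print(f"{rate:.0f} scrub RMW operations per second on average")
```

Under these assumptions the average works out to roughly 3,100 RMW operations per second, which is a very small fraction of the traffic a DDR4 interface can carry.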

The memory can be initialized with the proper ECC values at the end of memory calibration. The amount of memory to be initialized is configurable, and the memory controller does not execute any user commands until this process is completed. Initialization is not required; however, you must ensure that no read is issued to an address that has not first been written.

During background scrubbing, when correctable errors are detected, the corrected data is written back to the DRAM. There is no user notification for correctable errors. When uncorrectable errors are detected on a background scrub, the data is not corrected, but the ECC poison bits are set for two or four consecutive memory bursts, depending on the memory topology. When both correctable and uncorrectable errors are detected during a background scrub, the corrected data is written back to the DRAM and the appropriate ECC poison bits are set. The poisoned ECC data always aligns to a 16-byte (128-bit) NoC addressing boundary. DDR4 is always BL8 and LPDDR4 is always BL16.

  • For x64 DDR4 (x72 with ECC), if an uncorrectable (UC) error is detected on BL1 of BL[7:0], the ECC poison is set on BL0 and BL1.
  • For x16 LPDDR4 (x24 with ECC), if a UC error is detected on BL6 of BL[15:0], the ECC poison is set on BL[7:4].
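Both examples follow the same pattern: the burst position with the UC error is rounded down to the start of an aligned group of two or four positions, and the whole group is poisoned. The Python sketch below reproduces the two examples under that reading. The function name and parameters are hypothetical, and the group size must be supplied by the caller for the memory topology in use.

```python
def poisoned_bursts(uc_burst: int, group_size: int, burst_length: int) -> list[int]:
    """Return the BLn positions poisoned when a UC error is found at uc_burst.

    Hypothetical helper: group_size is 2 or 4 depending on the memory
    topology, and burst_length is 8 for DDR4 or 16 for LPDDR4.
    """
    if group_size not in (2, 4):
        raise ValueError("group_size must be 2 or 4")
    if not 0 <= uc_burst < burst_length:
        raise ValueError("uc_burst is outside the burst")
    # Round down to the start of the aligned group containing uc_burst.
    start = (uc_burst // group_size) * group_size
    return list(range(start, start + group_size))


# x64 DDR4 (x72 with ECC): UC on BL1 of BL[7:0]
print(poisoned_bursts(1, group_size=2, burst_length=8))    # [0, 1]
# x16 LPDDR4 (x24 with ECC): UC on BL6 of BL[15:0]
print(poisoned_bursts(6, group_size=4, burst_length=16))   # [4, 5, 6, 7]
```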

Uncorrectable errors detected on a background scrub do not generate a user notification. When the poisoned memory is read back with a normal read command, the ECC error interrupts fire, the ECC status bits update, and an AXI SLVERR occurs.

Table 1. ECC Poisoning Bursts for Uncorrectable Error on Background Scrub
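| DRAM Bus Configuration | BLn Burst with UC Error Detected on Scrub RMW Read-Phase | BLn Burst Poisoned by Scrub RMW Write-Phase | Values of N  | Memory Poisoned |
|------------------------|----------------------------------------------------------|---------------------------------------------|--------------|-----------------|
| DDR4 x72               | N, N+1                                                   | N, N+1                                      | 0, 2, 4, 6   | 16 Bytes        |
| DDR4 x40               | N, N+1, N+2, N+3                                         | N, N+1, N+2, N+3                            | 0, 4         | 16 Bytes        |
| DDR4 x24               | N, N+1, N+2, N+3                                         | N, N+1, N+2, N+3                            | 0, 4         | 8 Bytes         |
| LPDDR4 x40             | N, N+1, N+2, N+3                                         | N, N+1, N+2, N+3                            | 0, 4, 8, 12  | 16 Bytes        |
| LPDDR4 x24             | N, N+1, N+2, N+3                                         | N, N+1, N+2, N+3                            | 0, 4, 8, 12  | 8 Bytes         |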
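The Memory Poisoned column in Table 1 is consistent with multiplying the data width of the bus by the number of poisoned bursts. The sketch below checks that arithmetic. It assumes that every configuration carries 8 ECC bits, so that x72, x40, and x24 have 64, 32, and 16 data bits; the text states this explicitly only for x72 and x24. The dictionary and function names are hypothetical.

```python
ECC_BITS = 8  # assumed for every configuration (stated in the text for x72 and x24)

# (memory type, total bus width) -> consecutive bursts poisoned, from Table 1
POISON_GROUP = {
    ("DDR4", 72): 2,
    ("DDR4", 40): 4,
    ("DDR4", 24): 4,
    ("LPDDR4", 40): 4,
    ("LPDDR4", 24): 4,
}


def poisoned_bytes(mem_type: str, bus_width: int) -> int:
    """Bytes of data poisoned by one uncorrectable error on a background scrub."""
    data_bytes_per_burst = (bus_width - ECC_BITS) // 8
    return data_bytes_per_burst * POISON_GROUP[(mem_type, bus_width)]


for mem_type, bus_width in POISON_GROUP:
    print(f"{mem_type} x{bus_width}: {poisoned_bytes(mem_type, bus_width)} bytes")
```

Under these assumptions the x72 and x40 configurations poison 16 bytes and the x24 configurations poison 8 bytes, matching the table.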