The user errors that typically create application hangs are listed below:
- A read-before-write in 5.0+ target platforms causes a Memory Interface Generator error correction code (MIG ECC) error. This is typically a user error. For example, this error might occur when a kernel is expected to write 4 KB of data to DDR but produces only 1 KB, and you then try to transfer the full 4 KB of data to the host. It can also happen if you supply a 1 KB buffer to a kernel, but the kernel tries to read 4 KB of data.
- An ECC read-before-write error also occurs if no data has been written to a
memory location since the last bitstream download, which results in MIG initialization,
but a read request is made for that same memory location. ECC errors stall the
affected MIG because kernels are usually not able to handle this error. This can
manifest in two different ways:
  - The CU might hang or stall because it cannot handle this error while reading from or writing to the affected MIG. The xbutil query shows that the CU is stuck in a BUSY state and is not making progress.
  - The AXI Firewall might trip if a PCIe® DMA request is made to the affected MIG, because the DMA engine is unable to complete the request. An AXI Firewall trip results in the Linux kernel driver sending the SIGBUS signal to kill all processes that have opened the device node. The xbutil query shows if an AXI Firewall has indeed tripped and includes a timestamp.
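
The read-before-write rule described above can be illustrated with a small host-side bookkeeping sketch. This is a pure software model and is not part of XRT or any runtime API: the class name `WrittenRangeTracker` and its methods are hypothetical, chosen only for illustration. It assumes the host records which byte ranges of a device buffer have been written since the last bitstream download, and checks a requested read against those ranges before issuing the transfer.

```python
class WrittenRangeTracker:
    """Hypothetical host-side record of which bytes of a device buffer
    have been written since the last bitstream download."""

    def __init__(self, size):
        self.size = size
        # Sorted, non-overlapping, non-adjacent [start, end) byte ranges.
        self._ranges = []

    def mark_written(self, offset, length):
        """Record that bytes [offset, offset + length) were written."""
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("write falls outside the buffer")
        if length == 0:
            return
        start, end = offset, offset + length
        merged = []
        for s, e in self._ranges:
            if e < start or s > end:
                # Disjoint from the new range: keep unchanged.
                merged.append((s, e))
            else:
                # Overlapping or adjacent: absorb into the new range.
                start, end = min(start, s), max(end, e)
        merged.append((start, end))
        merged.sort()
        self._ranges = merged

    def is_safe_to_read(self, offset, length):
        """Return True only if every requested byte was written earlier."""
        if offset < 0 or length < 0 or offset + length > self.size:
            return False
        if length == 0:
            return True
        # Ranges are merged, so a fully written request fits in one range.
        return any(s <= offset and offset + length <= e
                   for s, e in self._ranges)


if __name__ == "__main__":
    KB = 1024
    tracker = WrittenRangeTracker(4 * KB)
    tracker.mark_written(0, 1 * KB)             # kernel produced only 1 KB
    print(tracker.is_safe_to_read(0, 4 * KB))   # False: 3 KB never written
    print(tracker.is_safe_to_read(0, 1 * KB))   # True
```

In this model, the 4 KB read in the first example is rejected because only the first 1 KB was written. On real hardware the equivalent precaution would be to write (for example, zero-fill) the whole buffer before the kernel runs, or to transfer back only the bytes the kernel actually produced.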