The AI Engine kernel partition driver lets you query device errors through the sysfs interface and exposes the main status and program counter (PC) registers to help you debug runtime issues. The sysfs entries dump the errors and status registers in readable, script-friendly formats. In addition, every tile has sysfs entries that show the core, DMA, and lock status and the PC registers, which helps when debugging Linux runtime issues. The AI Engine kernel driver also provides sysfs entries for core dump and errors. At runtime, you or the runtime utilities can check sysfs to see all errors that have occurred for the application. The errors are reported in a readable format that is easy to script against.
The sysfs core dump entries read the core, DMA, and lock status and the PC registers from the hardware and show the values in a readable, script-friendly format. If the application stalls at runtime, for example when there is no output from the AI Engine, sysfs shows where it is stalling. Runtime utilities or offline tools can combine the core dump data with the graph information generated by the AI Engine compiler to debug the application in more detail. The following shows the structure of the sysfs entries:
/sys/class/aie/aiepart_<startcol_numcols>/
|-- <col_row>
|   |-- core    - For AIE array tiles only.
|   |-- dma     - For NoC tiles and array tiles only.
|   |-- error
|   |-- event
|   `-- lock    - For NoC tiles and array tiles only.
.
.
.
|-- core
|-- dma
|-- error
|-- error_stat
|-- lock
`-- status
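Because the entries are laid out as ordinary files and directories, a script can walk them directly. The following is a minimal Python sketch, not part of the driver or its tools. It assumes that each tile directory is named with a decimal column and row separated by an underscore, matching the <col_row> placeholder above. The helper names list_tile_entries and read_entry are hypothetical, and the sketch returns the raw text of an entry without interpreting it.

```python
import re
from pathlib import Path

# Assumption: tile directories are named "<col>_<row>" using decimal integers.
TILE_DIR = re.compile(r"^(\d+)_(\d+)$")


def list_tile_entries(partition_dir):
    """Map each (col, row) tile to the sorted names of its sysfs entries."""
    tiles = {}
    for child in Path(partition_dir).iterdir():
        match = TILE_DIR.match(child.name)
        if match and child.is_dir():
            key = (int(match.group(1)), int(match.group(2)))
            tiles[key] = sorted(entry.name for entry in child.iterdir())
    return tiles


def read_entry(partition_dir, name, tile=None):
    """Return the raw text of one entry.

    With tile=None the partition-level entry is read (for example "status");
    with tile=(col, row) the entry inside that tile directory is read.
    """
    base = Path(partition_dir)
    if tile is not None:
        base = base / f"{tile[0]}_{tile[1]}"
    return (base / name).read_text()
```

Pass the path of a partition directory under /sys/class/aie/ as partition_dir. Tiles that lack an entry (for example, core on a tile that is not an AIE array tile) simply do not list it, so a caller can check the returned names before reading.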
The following is an example output for AI Engine partition status. At the aperture level, there is one hardware_info node that shows the device generation, the number of rows and columns, and the tile types.
# Cat out info from sysfs node
xilinx-vek280-es1-20231:~$ cat /sys/class/aie/aieaperture_0_38/hardware_info
generation: aieml
total_cols: 38
total_rows: 11
shim_tile: start row: 0, num_rows: 1
memory_tile: start row: 1, num_rows: 2
aie_tile: start row: 3, num_rows: 8
The output format is independent of the device generation.
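To illustrate how this output lends itself to scripting, the following Python sketch parses the hardware_info text into a dictionary. It assumes only the layout seen in the sample above: one "key: value" pair per line, with the tile-type lines carrying comma-separated "name: number" fields. The function name parse_hardware_info is hypothetical, and other devices or driver versions may emit additional keys that this sketch treats as plain strings or integers.

```python
def parse_hardware_info(text):
    """Parse hardware_info output into a dictionary.

    Scalar lines such as "total_cols: 38" become integers or strings.
    Tile lines such as "shim_tile: start row: 0, num_rows: 1" become
    nested dictionaries like {"start_row": 0, "num_rows": 1}.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines or lines without a key
        key, value = key.strip(), value.strip()
        if "," in value:
            fields = {}
            for part in value.split(","):
                name, _, number = part.partition(":")
                # Normalize "start row" to "start_row" for easier lookup.
                fields[name.strip().replace(" ", "_")] = int(number)
            info[key] = fields
        elif value.isdigit():
            info[key] = int(value)
        else:
            info[key] = value
    return info


sample = """generation: aieml
total_cols: 38
total_rows: 11
shim_tile: start row: 0, num_rows: 1
memory_tile: start row: 1, num_rows: 2
aie_tile: start row: 3, num_rows: 8
"""

info = parse_hardware_info(sample)
print(info["generation"], info["total_cols"], info["aie_tile"]["start_row"])
```

On a target, the text would come from reading the hardware_info node instead of the embedded sample string.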