Useful Debug Operating System Commands - UG1260

KCU1500 Board User Guide (UG1260)

Document ID: UG1260
Release Date: 2023-07-27
Revision: 1.5 English

This section includes some typical outputs for a VCU1525 board when running debug commands. Only the relevant output relating to accelerator boards is included. Some numbers can change depending on the deployment machine.

lspci: A command on Unix-like operating systems that prints ("lists") detailed information about all PCI buses and devices in the system. It is based on a common portable library, libpci, that provides access to the PCI configuration space on a variety of operating systems. Use this command to check whether the accelerator board has booted correctly, is recognized as a device, and is enumerated. The lspci command lists the controller addresses as follows:

$ lspci

...

03:00.0 Serial controller: AMD Corporation Device 6a90

03:00.1 Serial controller: AMD Corporation Device 6a8f

...

Output lines for this command have been omitted for brevity here.

If using a local board (as opposed to a board in a cloud environment), use the following command to list only the devices that match the vendor ID (10ee for AMD):

$ lspci -d 10ee:

03:00.0 Serial controller: AMD Corporation Device 6a90

03:00.1 Serial controller: AMD Corporation Device 6a8f

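Where:

-d selects devices by ID, given in the form vendor:device.

10ee is the vendor ID. Leaving the device field after the colon empty matches every device from that vendor.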
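As a rough illustration of the enumeration check, the following sketch counts the PCIe functions listed for the vendor ID. The helper name count_vendor_functions is hypothetical, and the expectation of two functions (03:00.0 and 03:00.1) is an assumption taken from the sample output above; other boards or shells might expose a different number.

```shell
#!/bin/sh
# Hypothetical helper: count the PCIe functions listed on stdin.
# A device line starts with a bus:device.function address such as "03:00.0",
# optionally preceded by a 4-digit PCI domain. Indented detail lines are ignored.
count_vendor_functions() {
    grep -c -E '^([0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7] '
}

# Demo using the two lines from the sample output above instead of live hardware.
sample='03:00.0 Serial controller: AMD Corporation Device 6a90
03:00.1 Serial controller: AMD Corporation Device 6a8f'

found=$(printf '%s\n' "$sample" | count_vendor_functions)
if [ "$found" -eq 2 ]; then
    echo "Both functions enumerated"
else
    echo "Expected 2 functions, found $found"
fi
```

On a live system, the same helper can read the real listing: lspci -d 10ee: | count_vendor_functions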

A more verbose output can be generated as follows:

$ lspci -vv -d 10ee:

03:00.0 Serial controller: AMD Corporation Device 6a90 (prog-if 01 [16450])

   Subsystem: AMD Corporation Device 4351

   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-

   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

   Latency: 0, Cache Line Size: 64 bytes

   Interrupt: pin A routed to IRQ 46

   Region 0: Memory at f6000000 (32-bit, non-prefetchable) [size=32M]

   Region 1: Memory at f8040000 (32-bit, non-prefetchable) [size=64K]

   Capabilities: <access denied>

   Kernel driver in use: xocl

   Kernel modules: xocl

03:00.1 Serial controller: AMD Corporation Device 6a8f (prog-if 01 [16450])

   Subsystem: AMD Corporation Device 4351

   Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-

   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

   Region 0: Memory at f4000000 (32-bit, non-prefetchable) [size=32M]

   Region 2: Memory at f8020000 (32-bit, non-prefetchable) [size=128K]

   Region 4: Memory at f8000000 (32-bit, non-prefetchable) [size=128K]

   Capabilities: <access denied>

   Kernel driver in use: xclmgmt

   Kernel modules: xclmgmt

 

Using the single -v option produces a shorter listing:

$ lspci -v

03:00.0 Serial controller: AMD Corporation Device 6a90 (prog-if 01 [16450])

   Subsystem: AMD Corporation Device 4351

   Flags: bus master, fast devsel, latency 0, IRQ 46

   Memory at f6000000 (32-bit, non-prefetchable) [size=32M]

   Memory at f8040000 (32-bit, non-prefetchable) [size=64K]

   Capabilities: <access denied>

   Kernel driver in use: xocl

   Kernel modules: xocl

 

03:00.1 Serial controller: AMD Corporation Device 6a8f (prog-if 01 [16450])

   Subsystem: AMD Corporation Device 4351

   Flags: fast devsel

   Memory at f4000000 (32-bit, non-prefetchable) [size=32M]

   Memory at f8020000 (32-bit, non-prefetchable) [size=128K]

   Memory at f8000000 (32-bit, non-prefetchable) [size=128K]

   Capabilities: <access denied>

   Kernel driver in use: xclmgmt

   Kernel modules: xclmgmt

When required, run the lspci command in super-user (sudo) mode to show the information hidden under the access-denied entries. The output is as follows:

$ sudo lspci -v

...

03:00.0 Serial controller: AMD Corporation Device 6a90 (prog-if 01 [16450])

   Subsystem: AMD Corporation Device 4351

   Flags: bus master, fast devsel, latency 0, IRQ 46

   Memory at f6000000 (32-bit, non-prefetchable) [size=32M]

   Memory at f8040000 (32-bit, non-prefetchable) [size=64K]

   Capabilities: [40] Power Management version 3

   Capabilities: [60] MSI-X: Enable+ Count=33 Masked-

   Capabilities: [70] Express Endpoint, MSI 00

   Capabilities: [100] Advanced Error Reporting

   Capabilities: [1c0] #19

   Capabilities: [350] Vendor Specific Information: ID=0001 Rev=1 Len=02c <?>

   Capabilities: [400] Access Control Services

   Kernel driver in use: xocl

   Kernel modules: xocl

 

03:00.1 Serial controller: AMD Corporation Device 6a8f (prog-if 01 [16450])

   Subsystem: AMD Corporation Device 4351

   Flags: fast devsel

   Memory at f4000000 (32-bit, non-prefetchable) [size=32M]

   Memory at f8020000 (32-bit, non-prefetchable) [size=128K]

   Memory at f8000000 (32-bit, non-prefetchable) [size=128K]

   Capabilities: [40] Power Management version 3

   Capabilities: [70] Express Endpoint, MSI 00

   Capabilities: [100] Advanced Error Reporting

   Capabilities: [400] Access Control Services

   Kernel driver in use: xclmgmt

   Kernel modules: xclmgmt
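To confirm at a glance which driver is bound to each function, the verbose listing can be reduced to one line per device. The following is a minimal sketch, not part of the guide; the helper name list_bound_drivers is hypothetical, and the demo input is an abridged copy of the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: print "<address> <driver>" for each device found in
# `lspci -v` style output read from stdin.
list_bound_drivers() {
    awk '
        # A device header line starts in column 1 with the bus address.
        /^[0-9a-fA-F]/ { addr = $1 }
        # The bound driver is reported on an indented detail line.
        /Kernel driver in use:/ { print addr, $NF }
    '
}

# Demo with an abridged copy of the sample output above.
list_bound_drivers <<'EOF'
03:00.0 Serial controller: AMD Corporation Device 6a90 (prog-if 01 [16450])
   Subsystem: AMD Corporation Device 4351
   Kernel driver in use: xocl
   Kernel modules: xocl

03:00.1 Serial controller: AMD Corporation Device 6a8f (prog-if 01 [16450])
   Subsystem: AMD Corporation Device 4351
   Kernel driver in use: xclmgmt
   Kernel modules: xclmgmt
EOF
```

For the sample input, this prints 03:00.0 xocl followed by 03:00.1 xclmgmt. On a live system, the input would come from lspci -v -d 10ee: instead of the embedded text.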

dmesg: Use this command to view the kernel messages, including those from the drivers:

$ dmesg

Another useful option is dmesg -T, which prints human-readable timestamps instead of the time in seconds since the machine booted.

If the dmesg command returns too much information, use the following variation to clear the kernel message buffer, so that the next dmesg call shows only new messages:

$ sudo dmesg -C
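When clearing the buffer is not desirable, another way to reduce the output is to filter it for the driver names. The sketch below assumes that the relevant driver messages contain the module names xocl or xclmgmt, which this guide does not state explicitly; the helper name filter_driver_messages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines on stdin that mention the
# xocl or xclmgmt drivers. Assumes driver messages contain the module name.
filter_driver_messages() {
    grep -E 'xocl|xclmgmt' || true   # an empty result is not treated as an error
}

# Typical use on a live system (sudo might be needed where access to the
# kernel log is restricted):
#   dmesg -T | filter_driver_messages
```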

lsmod: Use this command to check whether the driver modules are loaded in the OS, as follows:

$ lsmod | grep -E "^xocl|^xclmgmt"

The output would be similar to the following:

xocl     94208  0

xclmgmt  69632  0

The first column lists the driver modules, xocl and xclmgmt. The second column is the size of each driver in memory, in bytes. The third column is the use count: a value of 0 indicates that the drivers are currently not in use, which also means that no application running on the host is using or accessing the accelerator board.
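The three-column reading described above can be sketched as a small script. This is an illustrative example only; the helper name report_module_usage is hypothetical, and the demo input is the sample output shown above.

```shell
#!/bin/sh
# Hypothetical helper: report the size and use count of the xocl and xclmgmt
# modules from lsmod-style lines (module name, size, use count) read on stdin.
report_module_usage() {
    awk '
        $1 == "xocl" || $1 == "xclmgmt" {
            state = ($3 == 0) ? "not in use" : "in use"
            printf "%s: size=%s bytes, use count=%s (%s)\n", $1, $2, $3, state
        }
    '
}

# Demo with the sample lines shown above.
report_module_usage <<'EOF'
xocl     94208  0
xclmgmt  69632  0
EOF
```

On a live system, pipe the real listing into the helper: lsmod | report_module_usage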