Each AFID listed in the table below is associated with distinct Category, Type, and Severity fields, which are explained in the following sections.
Category
Below are the definitions for the five categories, each corresponding to specific events.
- Off-Package Link Errors: Covers events on the WAFL and XGMI links, reported with Corrected or Fatal severity.
- HBM Errors: Covers memory events such as Bad Page Retirement Threshold, On-die ECC, and End-to-end CRC, with severities ranging from Corrected to Fatal.
- Device Internal Errors: Covers events such as Hardware Assertion (HWA) and Watchdog Timeout (WDT), with severities ranging from Corrected to Fatal.
- CPER Format Errors (shown as CPER Format in the table): Covers a malformed CPER and incomplete or invalid ACA data. These events apply to all severities.
- Unidentified Errors: Covers events that cannot be identified. These events apply to all severities.
Type
The Type can refer to a specific component, message, event, or error within the specified Category. Examples include On-die ECC and Watchdog Timeout.
Severity
Severity describes the impact of an event and how it is handled. The Severity field in the table takes one of four values:
- Corrected: Events that the hardware has corrected and that are reported for diagnostic purposes.
- Uncorrected, Non-fatal: Events that cannot be corrected but can be contained.
- Fatal: Events that trigger a device reset for recovery.
- ALL: The event can have any of the three severities above.
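The meaning of the ALL value can be illustrated with a minimal Python sketch. The `Severity` enum and the `severity_matches` helper are hypothetical names introduced here for illustration and are not part of any vendor tool; the sketch only assumes that ALL covers each of the three concrete severities, as stated above.

```python
from enum import Enum


class Severity(Enum):
    """The three concrete severities described above (names are illustrative)."""
    CORRECTED = "Corrected"
    UNCORRECTED_NON_FATAL = "Uncorrected, Non-fatal"
    FATAL = "Fatal"


def severity_matches(table_value: str, observed: Severity) -> bool:
    """Return True if a table row's Severity field covers the observed severity.

    A row marked "ALL" covers every concrete severity; any other row
    covers only the severity it names.
    """
    if table_value == "ALL":
        return True
    return table_value == observed.value
```

For example, `severity_matches("ALL", Severity.FATAL)` is true, while `severity_matches("Fatal", Severity.CORRECTED)` is false.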
| AFID | Category | Type | Severity |
|---|---|---|---|
| 15 | Off-Package Link Errors | WAFL | Corrected |
| 16 | Off-Package Link Errors | WAFL | Fatal |
| 17 | Off-Package Link Errors | XGMI | Corrected |
| 18 | Off-Package Link Errors | XGMI | Fatal |
| 19 | HBM Errors | Bad Page Retirement Threshold | Fatal |
| 20 | HBM Errors | On-die ECC | Fatal |
| 21 | HBM Errors | End-to-end CRC | Fatal |
| 22 | HBM Errors | On-die ECC | Uncorrected, Non-fatal |
| 23 | HBM Errors | End-to-end CRC | Uncorrected, Non-fatal |
| 24 | HBM Errors | All | Corrected |
| 25 | HBM Errors | All Others | Fatal |
| 26 | Device Internal Errors | Hardware Assertion (HWA) | Fatal |
| 27 | Device Internal Errors | Watchdog Timeout (WDT) | Fatal |
| 28 | Device Internal Errors | All Others | Uncorrected, Non-fatal |
| 29 | Device Internal Errors | All Others | Corrected |
| 30 | Device Internal Errors | All Others | Fatal |
| 31 | CPER Format | Malformed CPER | ALL |
| 32 | CPER Format | Incomplete ACA Data | ALL |
| 33 | CPER Format | Invalid ACA Data | ALL |
| 34 | Unidentified Errors | Unidentified Error | ALL |
Note: While each AFID corresponds to a specific event, not all events necessitate immediate attention or action. For more detailed information, users are encouraged to contact their service provider.
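For tooling that needs to decode an AFID, the table above can be transcribed into a simple lookup structure. The following Python sketch is one possible encoding; `AfidEntry`, `AFID_TABLE`, and `describe_afid` are hypothetical names chosen for illustration, and the values are copied from the table as written.

```python
from typing import NamedTuple


class AfidEntry(NamedTuple):
    """One row of the AFID table (field names are illustrative)."""
    category: str
    type: str
    severity: str


# Transcribed row by row from the AFID table above.
AFID_TABLE = {
    15: AfidEntry("Off-Package Link Errors", "WAFL", "Corrected"),
    16: AfidEntry("Off-Package Link Errors", "WAFL", "Fatal"),
    17: AfidEntry("Off-Package Link Errors", "XGMI", "Corrected"),
    18: AfidEntry("Off-Package Link Errors", "XGMI", "Fatal"),
    19: AfidEntry("HBM Errors", "Bad Page Retirement Threshold", "Fatal"),
    20: AfidEntry("HBM Errors", "On-die ECC", "Fatal"),
    21: AfidEntry("HBM Errors", "End-to-end CRC", "Fatal"),
    22: AfidEntry("HBM Errors", "On-die ECC", "Uncorrected, Non-fatal"),
    23: AfidEntry("HBM Errors", "End-to-end CRC", "Uncorrected, Non-fatal"),
    24: AfidEntry("HBM Errors", "All", "Corrected"),
    25: AfidEntry("HBM Errors", "All Others", "Fatal"),
    26: AfidEntry("Device Internal Errors", "Hardware Assertion (HWA)", "Fatal"),
    27: AfidEntry("Device Internal Errors", "Watchdog Timeout (WDT)", "Fatal"),
    28: AfidEntry("Device Internal Errors", "All Others", "Uncorrected, Non-fatal"),
    29: AfidEntry("Device Internal Errors", "All Others", "Corrected"),
    30: AfidEntry("Device Internal Errors", "All Others", "Fatal"),
    31: AfidEntry("CPER Format", "Malformed CPER", "ALL"),
    32: AfidEntry("CPER Format", "Incomplete ACA Data", "ALL"),
    33: AfidEntry("CPER Format", "Invalid ACA Data", "ALL"),
    34: AfidEntry("Unidentified Errors", "Unidentified Error", "ALL"),
}


def describe_afid(afid: int) -> str:
    """Return a one-line description of an AFID, or a notice if it is not listed."""
    entry = AFID_TABLE.get(afid)
    if entry is None:
        return f"AFID {afid}: not listed in this table"
    return f"AFID {afid}: {entry.category} / {entry.type} / {entry.severity}"
```

Calling `describe_afid(27)` returns `AFID 27: Device Internal Errors / Watchdog Timeout (WDT) / Fatal`. A dictionary keyed by AFID keeps the lookup direct and makes an unlisted value easy to detect.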