Hardware issues can range from link bring-up failures to problems seen after hours of testing. This section provides debug steps for common issues. The Vivado Design Suite debug feature is a valuable resource for hardware debug; the signal names mentioned in the following sections can be probed with it when debugging the specific problems.
Many of these common issues can also be applied to debugging design simulations. See the following sections for details.
General Checks
- Ensure that all the timing constraints for the core were met during Place and Route.
- Ensure that all clock sources are clean. If using DCMs in the design, ensure that all DCMs have obtained lock by monitoring the locked port.
Problems with the MDIO
- Ensure that the MDIO is driven properly. See MDIO Management System for detailed information about performing MDIO transactions.
- Check that the mdc clock is running and that the frequency is 2.5 MHz or less.
- Read from a configuration register that does not have all 0s as a default. If all 0s are read back, the read was unsuccessful. Check that the PHYAD field placed into the MDIO frame matches the value placed on the phyad[4:0] port of the core.
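The MDIO checks above can be sketched in a few lines. This is a minimal illustration that assumes the standard IEEE 802.3 Clause 22 frame layout (ST = 01, OP = 10 for a read, then 5-bit PHYAD and 5-bit REGAD); the function names and the simple "host clock divided by N" model for mdc are hypothetical and not part of the core's deliverables.

```python
MDC_MAX_HZ = 2_500_000  # upper mdc frequency limit quoted in the text


def mdio_read_header(phyad: int, regad: int) -> int:
    """Return the 14 header bits of a Clause 22 read: ST(01) OP(10) PHYAD REGAD."""
    if not 0 <= phyad < 32 or not 0 <= regad < 32:
        raise ValueError("PHYAD and REGAD are 5-bit fields")
    return (0b01 << 12) | (0b10 << 10) | (phyad << 5) | regad


def header_phyad(header: int) -> int:
    """Extract the PHYAD field so it can be compared with the phyad[4:0] port value."""
    return (header >> 5) & 0x1F


def min_mdc_divider(host_clk_hz: int) -> int:
    """Smallest integer divider N (assumed model: mdc = host_clk / N) keeping mdc <= 2.5 MHz."""
    return -(-host_clk_hz // MDC_MAX_HZ)  # ceiling division


def read_looks_valid(value: int, reset_default: int) -> bool:
    """Heuristic from the text: all 0s from a register with a nonzero default means the read failed."""
    return not (value == 0 and reset_default != 0)
```

For example, if phyad[4:0] is tied to 1, header_phyad(mdio_read_header(1, 1)) should also return 1; a mismatch would explain reads that return all 0s.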
Problems with Data Reception or Transmission
When no data is being received or transmitted on a channel:
- Ensure that a valid link has been established between the core and its link partner, either by Auto-Negotiation or Manual Configuration; status_vector_chx[0] and status_vector_chx[1] should both be High. If no link has been established, see the topics discussed in the next section.
- Transmission through the core is not allowed unless a link has been established. This behavior can be overridden by setting the Unidirectional Enable bit.
- Ensure that the Isolate state has been disabled.
By default, the Isolate state is enabled after power-up. In the PHY mode, the PHY is electrically isolated from the GMII; for an internal GMII, it behaves as if it is isolated. This results in no data transfer across the GMII.
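As a rough sketch of checking and clearing the Isolate state over MDIO, the helpers below assume that MDIO register 0 of the core follows the standard IEEE 802.3 Clause 22 Control register layout, with Isolate at bit 0.10 and Unidirectional Enable at bit 0.5. Confirm the bit positions against the register description for the core before relying on them; the helper names are hypothetical.

```python
ISOLATE_BIT = 10        # assumed: Clause 22 Control register, bit 0.10
UNIDIR_ENABLE_BIT = 5   # assumed: Clause 22 Control register, bit 0.5


def is_isolated(control: int) -> bool:
    """True when the Isolate bit is set in a 16-bit Control register value."""
    return bool((control >> ISOLATE_BIT) & 1)


def clear_isolate(control: int) -> int:
    """Return the value to write back with Isolate cleared and all other bits preserved."""
    return control & ~(1 << ISOLATE_BIT) & 0xFFFF


def set_unidirectional_enable(control: int) -> int:
    """Return the value to write back with Unidirectional Enable set."""
    return (control | (1 << UNIDIR_ENABLE_BIT)) & 0xFFFF
```

A read-modify-write of register 0 using clear_isolate() preserves the remaining configuration bits, which is usually safer than writing a fixed constant.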
Problems with Auto-Negotiation
Determine whether Auto-Negotiation has completed successfully by doing one of the following.
- Poll the Auto-Negotiation completion bit 1.5 in MDIO register 1: Status register.
- Use the Auto-Negotiation interrupt port of the core.
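Polling bit 1.5 could look like the following minimal sketch. The mdio_read callable is a hypothetical stand-in for whatever routine the system uses to perform an MDIO read of a given register; the timeout and poll interval are arbitrary illustration values.

```python
import time
from typing import Callable

STATUS_REGISTER = 1
AN_COMPLETE_MASK = 1 << 5  # bit 1.5, Auto-Negotiation complete


def wait_for_an_complete(
    mdio_read: Callable[[int], int],
    timeout_s: float = 1.0,
    poll_interval_s: float = 0.01,
) -> bool:
    """Poll the Status register until bit 1.5 is set or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if mdio_read(STATUS_REGISTER) & AN_COMPLETE_MASK:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

If the function returns False, continue with the checks that follow.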
If Auto-Negotiation is not completing, ensure that Auto-Negotiation is enabled in both the core and in the link partner (the device or test equipment connected to the core). Auto-Negotiation cannot complete successfully unless both devices are configured to perform Auto-Negotiation.
The Auto-Negotiation procedure requires that the Auto-Negotiation handshaking protocol between the core and its link partner, which lasts for several link timer periods, occur without a bit error. A detected bit error causes Auto-Negotiation to go back to the beginning and restart. Therefore, a link with an exceptionally high bit error rate might not be capable of completing Auto-Negotiation, or might lead to a long Auto-Negotiation period caused by the numerous Auto-Negotiation restarts. If this appears to be the case, try the next step and see Problems with a High Bit Error Rate.
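To get a feel for why a high bit error rate stalls Auto-Negotiation, the following back-of-the-envelope sketch estimates the chance that a handshake window passes with no bit error. It assumes independent bit errors, a 5 Gb/s line rate, and a hypothetical handshake window of 4.8 ms; it also pessimistically treats every bit on the line as able to trigger a restart. None of these numbers come from the core documentation.

```python
import math


def p_error_free(ber: float, line_rate_bps: float, duration_s: float) -> float:
    """Probability that no bit error occurs in the window: (1 - BER) ** bits."""
    bits = line_rate_bps * duration_s
    return math.exp(bits * math.log1p(-ber))


def expected_attempts(ber: float, line_rate_bps: float, duration_s: float) -> float:
    """Mean number of attempts until one error-free window (geometric distribution)."""
    return 1.0 / p_error_free(ber, line_rate_bps, duration_s)


if __name__ == "__main__":
    for ber in (1e-12, 1e-9, 1e-7):
        p = p_error_free(ber, 5e9, 4.8e-3)  # assumed line rate and window length
        print(f"BER {ber:g}: P(error-free window) = {p:.4f}")
```

Under these assumptions the probability stays close to 1 at low error rates and falls off steeply as the error rate rises, which matches the repeated-restart behavior described above.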
Try disabling Auto-Negotiation in both the core and the link partner and see if both devices report a valid link and can pass traffic. If they do, it proves that the core and link partner are otherwise configured correctly. If they do not pass traffic, see Problems in Obtaining a Link (Auto-Negotiation Disabled).
Problems in Obtaining a Link (Auto-Negotiation Disabled)
Determine whether the device has successfully obtained a link with its link partner by doing the following:
- Reading bit 1.2, Link Status, in MDIO register 1: Status register using the optional MDIO management interface (or look at status_vector_chx[1]).
- Monitoring the state of status_vector_chx[0]. If this is logic 1, then synchronization, and therefore a link, has been established. See Bit[0]: Link Status.
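The two link checks above can be combined into a small decoding helper, sketched below. It takes a raw Status register value (however it was read) and a sampled status_vector_chx value; the function names are hypothetical. In IEEE 802.3 Clause 22 the Link Status bit is defined as latching low, so some drivers read the register twice to get the current state; whether that matters in a given setup is an assumption to verify.

```python
LINK_STATUS_MASK = 1 << 2  # bit 1.2, Link Status


def link_report(status_reg: int, status_vector: int) -> dict:
    """Decode the link-related bits named in the text into booleans."""
    return {
        "mdio_link_status": bool(status_reg & LINK_STATUS_MASK),
        "status_vector_bit0": bool(status_vector & 0b01),
        "status_vector_bit1": bool(status_vector & 0b10),
    }


def status_vector_link_ok(status_vector: int) -> bool:
    """True when status_vector_chx[0] and status_vector_chx[1] are both High."""
    return (status_vector & 0b11) == 0b11
```

A disagreement between the MDIO view and the status vector view is itself a useful clue, because it points at the management path rather than the link.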
If the devices have failed to form a link then do the following:
- Ensure that Auto-Negotiation is disabled in both the core and in the link partner (the device or test equipment connected to the core).
- Monitor the state of the signal_detect signal input to the core. This should either be:
  - Connected to an optical module to detect the presence of light. Logic 1 indicates that the optical module is correctly detecting light; logic 0 indicates a fault. Therefore, ensure that this is driven with the correct polarity.
  - Tied to logic 1 (if not connected to an optical module).
- When signal_detect is set to logic 0, the receiver synchronization state machine of the core is forced to remain in the loss of sync state.
- See Problems with a High Bit Error Rate in a subsequent section.
When using a device-specific transceiver, perform these additional checks:
- Ensure that the polarities of the txn/txp and rxn/rxp lines are not reversed. If they are, this can be fixed by using the txpolarity and rxpolarity ports of the device-specific transceiver.
- Check that the device-specific transceiver is not being held in reset by monitoring the mgt_tx_reset and mgt_rx_reset signals between the core and the device-specific transceiver. If these are asserted, this indicates that the PMA Phase-Locked Loop (PLL) circuitry in the device-specific transceiver has not obtained lock; check the PLL Lock signals output from the device-specific transceiver.
Problems with a High Bit Error Rate
Symptoms
The severity of a high bit error rate can vary and cause any of the following symptoms:
- Failure to complete Auto-Negotiation when Auto-Negotiation is enabled.
- Failure to obtain a link when Auto-Negotiation is disabled in both the core and the link partner.
- High proportion of lost packets when passed between two connected devices that are capable of obtaining a link through Auto-Negotiation or otherwise. This can usually be accurately measured if the Ethernet MAC attached to the core contains statistic counters.
- All bit errors detected by the QSGMII Receive channel logic during frame reception show up as Frame Check Sequence Errors in an attached Ethernet MAC.
Debugging
- Compare the issue across several devices or PCBs to ensure that the issue is not a one-off case.
- Try using an alternative link partner or test equipment and then compare results.
- Try swapping the optical module on a misperforming device and repeat the tests.
- Monitor the following signals:
  - rxdisperr
  - rxnotintable
  These signals should not be asserted over the duration of a few seconds, minutes, or even hours. If they are frequently asserted, it might indicate an issue with the device-specific transceiver.
- Place the device-specific transceiver into parallel or serial loopback.
- If the core exhibits correct operation in device-specific transceiver serial loopback, but not when loopback is performed through an optical cable, it might indicate a faulty optical module.
- If the core exhibits correct operation in device-specific transceiver parallel loopback but not in serial loopback, this can indicate a device-specific transceiver issue.
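The debugging steps above can be summarized in a short sketch. The first helper counts how often an error flag such as rxdisperr or rxnotintable was asserted in a captured sample sequence (for example, exported from a debug capture); the second encodes only the two loopback conclusions stated in the list and reports every other combination as not covered. All names are hypothetical.

```python
from typing import Sequence


def count_assertions(samples: Sequence[int]) -> int:
    """Count 0-to-1 transitions of an error flag in a captured sample sequence."""
    count = 0
    previous = 0
    for sample in samples:
        if sample and not previous:
            count += 1
        previous = sample
    return count


def triage_loopback(parallel_ok: bool, serial_ok: bool, optical_ok: bool) -> str:
    """Map loopback results to the conclusions given in the text."""
    if parallel_ok and not serial_ok:
        return "possible device-specific transceiver issue"
    if serial_ok and not optical_ok:
        return "possible faulty optical module"
    if parallel_ok and serial_ok and optical_ok:
        return "all loopback modes pass"
    return "not covered by these steps; investigate further"
```

For instance, a capture of [0, 1, 1, 0, 1] contains two separate assertions; any nonzero count over a long observation window is worth investigating.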