Clock Multiplexing - 2025.2 English - UG1387

Versal Adaptive SoC Hardware, IP, and Platform Development Methodology Guide (UG1387)

Document ID
UG1387
Release Date
2025-12-17
Version
2025.2 English

You can build a clock multiplexer using a combination of parallel and cascaded BUFGCTRLs. The placer finds the optimal placement based on the clock buffer site availability. If possible, the placer places BUFGCTRLs in adjacent sites to take advantage of the dedicated cascade paths. If that is not possible, the placer attempts to place the BUFGCTRLs from the same level in the adjacent clock regions.

The following figure shows a 4:1 MUX with balanced cascading. The two first-level BUFGCTRL buffers are placed in the sites (X10Y7, X10Y5) directly adjacent to the last BUFGCTRL (X10Y6). This configuration ensures a comparable insertion delay for all the clocks reaching the last BUFGCTRL. You can use an equivalent structure for a 3:1 MUX.

Figure 1. 4:1 MUX Using Parallel BUFGCTRL
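The placement described for the 4:1 MUX can be checked with a small script. The following is a minimal sketch, not a Vivado API: the helper names `parse_site` and `is_adjacent` are hypothetical, and the adjacency rule (same X column, Y index differing by exactly one) is an assumption inferred from the X10Y5, X10Y6, X10Y7 example above.

```python
import re

def parse_site(site):
    """Split a site suffix such as 'X10Y6' into integer (x, y) coordinates."""
    match = re.fullmatch(r"X(\d+)Y(\d+)", site)
    if match is None:
        raise ValueError(f"unrecognized site name: {site!r}")
    return int(match.group(1)), int(match.group(2))

def is_adjacent(site_a, site_b):
    """Assumed adjacency rule: same X column, Y index differing by exactly 1."""
    (xa, ya), (xb, yb) = parse_site(site_a), parse_site(site_b)
    return xa == xb and abs(ya - yb) == 1

# Sites taken from the 4:1 MUX example in the text.
first_level = ["X10Y7", "X10Y5"]
last_level = "X10Y6"

# The cascade is balanced when both first-level buffers neighbor the last one.
balanced = all(is_adjacent(site, last_level) for site in first_level)
print(balanced)
```

Under this assumed rule, both first-level sites neighbor the last BUFGCTRL, while the two first-level sites are not adjacent to each other.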

When creating a 5:1 or larger clock MUX, it is common to build a symmetrical clock structure, as shown in the following figure. However, this solution is suboptimal: each BUFGCTRL has only one cascade path to each of its two adjacent BUFGCTRLs, so not every connection between the BUFGCTRLs in the structure can use a minimal-delay path.

Figure 2. Non-Recommended 8:1 Balanced Clock MUX Structure
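The limitation of the symmetrical structure can be illustrated with a small brute-force search. This is a sketch under stated assumptions, not a model of the actual device: it assumes the seven BUFGCTRLs of an 8:1 tree occupy seven consecutive sites in one column, that a cascade path exists only between neighboring sites, and that cascade direction can be ignored. The node names are hypothetical.

```python
from itertools import permutations

# Buffer-to-buffer connections of a symmetrical 8:1 tree built from
# seven 2:1 buffers: one root, two mid-level buffers, four leaf buffers.
TREE_EDGES = [
    ("root", "mid_a"), ("root", "mid_b"),
    ("mid_a", "leaf_a0"), ("mid_a", "leaf_a1"),
    ("mid_b", "leaf_b0"), ("mid_b", "leaf_b1"),
]
NODES = sorted({name for edge in TREE_EDGES for name in edge})

def cascaded_connections(order):
    """Count tree connections whose two buffers sit in neighboring sites."""
    position = {name: index for index, name in enumerate(order)}
    return sum(1 for a, b in TREE_EDGES if abs(position[a] - position[b]) == 1)

# Try every assignment of the seven buffers to seven consecutive sites.
best = max(cascaded_connections(order) for order in permutations(NODES))
print(f"{best} of {len(TREE_EDGES)} connections can use a cascade path")
```

In this model, each mid-level buffer needs three connections but has only two neighbors, so at least two of the six connections must leave the dedicated cascade paths regardless of placement.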

To support larger clock multiplexers (from 5:1 to 8:1 MUX), AMD recommends using cascaded BUFGCTRL buffers as shown in the following figures.

The following figure shows an optimal 8:1 MUX that uses seven BUFGCTRL buffers. In this structure, the latency through the BUFGCTRL cascade is more evenly distributed among the clocks.

Figure 3. Optimal 8:1 MUX Using Cascaded BUFGCTRL with Evenly Distributed Latency

The following figure shows an 8:1 MUX that uses seven BUFGCTRL buffers. In this structure, the latency through the BUFGCTRL cascade is highly unbalanced among the clocks.

Figure 4. 8:1 MUX Using Cascaded BUFGCTRL with Highly Unbalanced Latency
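The difference between the two cascaded structures can be quantified by counting how many BUFGCTRL stages each input clock traverses. The sketch below does not reproduce the exact topologies in the figures. It assumes two hypothetical arrangements of seven 2:1 buffers: a single linear chain, and two shorter chains feeding an output buffer placed between them. The function names are illustrative only.

```python
def chain_depths(num_clocks):
    """Stages per input clock for one linear cascade of num_clocks - 1 buffers.

    The first buffer takes two clocks; every later buffer takes the
    cascaded output of the previous buffer plus one new clock.
    """
    stages = num_clocks - 1
    # The two clocks entering the first buffer traverse every stage.
    return [stages, stages] + list(range(stages - 1, 0, -1))

def center_output_depths(num_clocks):
    """Stages per input clock when two chains feed a central output buffer."""
    half = num_clocks // 2
    side_depths = chain_depths(half) + chain_depths(num_clocks - half)
    # Every clock also passes through the central output buffer.
    return [depth + 1 for depth in side_depths]

linear = chain_depths(8)            # assumed unbalanced arrangement
centered = center_output_depths(8)  # assumed more even arrangement

print("linear chain :", linear, "spread", max(linear) - min(linear))
print("center output:", centered, "spread", max(centered) - min(centered))
```

Both arrangements use seven buffers for eight clocks. In this model, the linear chain has a worst-case depth of seven stages and a spread of six, while the center-output arrangement has a worst-case depth of four stages and a spread of two.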

When using wide BUFGCTRL-based clock multiplexers, the clock insertion delays cannot be balanced because some paths through the hardware are longer than others. Therefore, this method is recommended only for multiplexing asynchronous clocks.