The selection of clocks and their frequencies is an important
part of defining the performance of the algorithms and signal processing blocks of a
system. Depending on the source of the input data and the destination of the results,
different solutions need to be engineered to meet the design requirements. This
section describes the clocking of processing elements, such as HLS kernels, RTL PL
kernels, or AI Engine kernels, added with Vitis to an extensible platform. The
interface to input data and output results can be categorized as streaming or memory
access.
Note: The number of clocking resources available is device dependent, so plan
the clock usage carefully.
For streaming access, the bit-width of the data and the clock frequency
determine the throughput. The HLS, RTL, or AI Engine kernels processing the data need
to sustain this throughput to avoid loss of data. The throughput used here is defined
as:
throughput (bits/second) = bit-width * clock frequency (Hz) / initiation interval
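As a quick illustration of the throughput definition above, the following Python sketch evaluates it for a hypothetical 128-bit stream clocked at 300 MHz. The function name and the example numbers are assumptions chosen for illustration; they are not part of any Vitis API.

```python
def stream_throughput_bps(bit_width, clock_hz, initiation_interval=1):
    """Return streaming throughput in bits per second.

    throughput = bit-width * clock frequency (Hz) / initiation interval
    """
    if bit_width <= 0 or clock_hz <= 0 or initiation_interval <= 0:
        raise ValueError("all arguments must be positive")
    return bit_width * clock_hz / initiation_interval


# Hypothetical example: a 128-bit stream at 300 MHz.
ii1 = stream_throughput_bps(128, 300e6, initiation_interval=1)
ii2 = stream_throughput_bps(128, 300e6, initiation_interval=2)
print(f"II=1: {ii1 / 1e9:.1f} Gbit/s")  # prints "II=1: 38.4 Gbit/s"
print(f"II=2: {ii2 / 1e9:.1f} Gbit/s")  # prints "II=2: 19.2 Gbit/s"
```

Doubling the initiation interval halves the sustained throughput, so a kernel with II=2 would need twice the bit-width or twice the clock frequency to keep up with the same input stream.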
For AI Engine, the interface
clock frequency is specified on the PLIO to determine the DMA scheduling or stream
access rate, but the kernel itself always runs at the AIE clock. For details, see the
topic AI Engine-to-PL Rate Matching.
Data access type | Design impact for Vitis | Comments |
---|---|---|
Synchronous single-rate | Connect to platform clocks and data paths | Kernel clocks match the source and destination in the platform. |
Figure 1. Synchronous single-rate with clock from platform
Synchronous multi-rate | | To reduce the clock rate while maintaining the throughput requirement, the bit-width can be increased. Vitis adds a Data Width Converter (DWC) block to manage the relationship between the bit-width and clocks whose frequencies are related by a power of 2. If necessary, Vitis also infers missing clocks. For ratios that are not a power of 2, you need advanced clocking and handshaking techniques. Refer to Versal Adaptive SoC Clocking Resources Architecture Manual (AM003) and Clocking Wizard for Versal Adaptive SoC LogiCORE IP Product Guide (PG321). Note: A multi-rate design is synchronous if the clocks have a rational relation and a common reference (they originate from the same PLL/MMCM). If only one of the multi-rate clocks exists in the platform, Vitis infers a clock wizard to satisfy this condition. |
Figure 2. Synchronous multi-rate with clock from platform and inferred DWC
Figure 3. Synchronous multi-rate with inferred clocks and CDC
Time division multiplexing | | The kernel takes advantage of running at a higher throughput than the incoming data by buffering the incoming data and processing each buffer in sequence. This is closely related to multi-rate signal processing, except that buffers are mandatory. Note: Vitis can infer a DWC and additional clocks to support power-of-2 rate changes. The multiplexing mechanism and buffers need to be designed by the kernel developer. |
Packet-switching | Need buffers and logic for handling the control and payload. | Similar to multi-rate, but can require clock rate overhead to manage the control headers. Note: Vitis does not support automatically inferring packet switching. The packet handling mechanism needs to be designed by the kernel developer. |
Asynchronous | | CDC (clock domain crossing) logic is required to transfer data across unrelated clock domains. The throughput of the processing kernel needs to be equal to or higher than that of the input data. FIFO buffers need to be inserted to handle differences in throughput and stall handshaking. Refer to Specifying Streaming Connections for adding FIFOs. Note: The AI Engine is clocked by a separate PLL. Even if the PLIO has the same frequency, the phase relation is unknown, so CDC logic is always inserted by Vitis. |
Figure 4. Asynchronous single-rate with CDC related to AI Engine PLIO
Note: The figures above are
simplified for illustrative purposes. The blocks that produce and consume the data
can be the same block or multiple blocks. The HLS and AI Engine kernels can interact
with as many other kernels as resources and interfaces permit.
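To make the synchronous multi-rate trade-off from the table concrete, the following Python sketch computes the bit-width needed to sustain the same throughput at a different clock frequency, and checks whether the two clock frequencies are related by a power of 2, which is the case the table says Vitis can handle with an inferred DWC. The helper names and example frequencies are hypothetical, an initiation interval of 1 is assumed on both sides, and the sketch only mirrors the arithmetic; it does not model how Vitis decides what to infer.

```python
from fractions import Fraction


def is_power_of_two_ratio(clk_a_hz, clk_b_hz):
    """True if the faster clock is 1x, 2x, 4x, ... the slower clock."""
    ratio = Fraction(max(clk_a_hz, clk_b_hz), min(clk_a_hz, clk_b_hz))
    if ratio.denominator != 1:
        return False  # non-integer ratio, e.g. 5/3
    n = ratio.numerator
    return (n & (n - 1)) == 0  # exactly one bit set means a power of 2


def width_for_same_throughput(src_width, src_hz, dst_hz):
    """Bit-width needed at dst_hz to match src_width bits at src_hz (II = 1 assumed)."""
    return Fraction(src_width * src_hz, dst_hz)


# Hypothetical example: a 64-bit stream at 600 MHz feeding a kernel at 300 MHz.
print(is_power_of_two_ratio(600_000_000, 300_000_000))          # prints "True"
print(width_for_same_throughput(64, 600_000_000, 300_000_000))  # prints "128"

# Hypothetical counter-example: 500 MHz to 300 MHz is a 5/3 ratio.
print(is_power_of_two_ratio(500_000_000, 300_000_000))          # prints "False"
```

In the second case the required width is not an integer number of bits, which is one way to see why non-power-of-2 relations call for the advanced clocking and handshaking techniques mentioned in the table.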
In Vitis, all memory types are modeled as AXI slave interfaces with metadata, and any clock conversion is handled implicitly through the AXI network connecting the kernel AXI master to the memory AXI slave. When HLS kernels convert from memory to stream or from stream to memory, or when several kernels contend for the same memory, the HLS kernel code may need performance optimizations to meet the required access type. This is also true if the access is to a shared resource. For details on connecting memories, refer to Mapping Kernel Ports to Memory. For details on HLS coding, refer to Memory Mapped Interfaces in the Vitis High-Level Synthesis User Guide (UG1399).
The extensible platform XSA contains information about the available clock
domains. Running the platforminfo -d <platformname>.xsa utility lists the
platform clock domains under Clocking Information. For further details and
examples, refer to Identifying Platform Clocks.
For details on how to use link options, refer to --clock Options.
Expected outcome | Link options | Comment |
---|---|---|
Use default platform clock source pin | Not needed | Vitis automatically connects any unspecified kernel to the default clock source pin. |
Use non-default platform clock | Use clock.id | All kernel clock pins will be driven by the platform clock source pin with the specified ID. If the kernel contains multiple clock pins, you can specify <kernel>.<clk pin> to differentiate between the clocks. |
The requested frequency must match an existing platform clock within the tolerance range. Unless explicitly set, the default tolerance is 5%. | |
Add a new clock | Use freqhz | Vitis will add a clock wizard to generate the requested clock frequency. In some cases, when it is not possible to generate the exact frequency, the tool will generate the closest acceptable frequency within the tolerance range. |
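
The tolerance rule in the table can be pictured as a simple relative-error check. The following Python snippet is a minimal sketch, assuming the tolerance is applied as a fraction of the requested frequency; the exact comparison Vitis performs is not stated here, and the function name and the list of platform clocks are hypothetical.

```python
DEFAULT_TOLERANCE = 0.05  # 5% default tolerance, as stated in the table


def find_matching_clock(requested_hz, platform_clocks_hz, tolerance=DEFAULT_TOLERANCE):
    """Return the platform clock closest to requested_hz within tolerance, else None.

    Assumption: tolerance is relative to the requested frequency.
    """
    candidates = [
        f for f in platform_clocks_hz
        if abs(f - requested_hz) <= tolerance * requested_hz
    ]
    if not candidates:
        return None  # no existing clock is close enough; a new clock would be needed
    return min(candidates, key=lambda f: abs(f - requested_hz))


# Hypothetical platform with three clock domains.
clocks = [100_000_000, 150_000_000, 300_000_000]

print(find_matching_clock(312_500_000, clocks))  # prints "300000000" (4% off, within 5%)
print(find_matching_clock(200_000_000, clocks))  # prints "None"
```

A request that returns None in this sketch corresponds to the "Add a new clock" row, where a clock wizard is added to generate the requested frequency.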