Refer to the following block diagram of the krnl_cbc kernel. It has four identical CBC engines, which receive input data from the AXI read master via the engine control unit. The engines send the data to, and receive output data from, the krnl_aes kernel via the AXI stream port, and then send the result to the AXI write master via the engine control unit.
An AXI control slave module is used to set the necessary kernel arguments. The krnl_cbc kernel reads its input from, and writes its output to, groups of words stored in global memory. Each internal engine handles one group of words at a time. Consecutive input groups are assigned to different internal CBC engines in round-robin fashion by the engine control module. The krnl_cbc kernel uses a single kernel clock for all internal modules.
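The round-robin assignment described above can be sketched in a few lines of Python. This is only an illustrative model, not the RTL: the function name `assign_groups` is hypothetical, and starting the rotation at engine 0 is an assumption the text does not confirm.

```python
NUM_ENGINES = 4  # the kernel contains four identical CBC engines


def assign_groups(num_groups, num_engines=NUM_ENGINES):
    """Return the engine index that handles each consecutive input group.

    Assumption: the rotation starts at engine 0 and never skips an engine.
    """
    return [group % num_engines for group in range(num_groups)]


if __name__ == "__main__":
    # Six consecutive groups wrap around after the fourth engine.
    print(assign_groups(6))  # [0, 1, 2, 3, 0, 1]
```

With this scheme, up to four groups can be in flight at once, one per engine.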
The krnl_cbc kernel supports the ap_ctrl_chain execution model. ap_ctrl_chain is an extension of the ap_ctrl_hs model in which kernel execution is divided into an input sync stage and an output sync stage. Control signals ap_start and ap_ready are used for input sync, while ap_done and ap_continue are used for output sync. Refer to Supported Kernel Execution Models for detailed explanations.
The following figure shows an example waveform of the ap_ctrl_chain model with two input sync beats and two output sync beats (the kernel executes two jobs consecutively).
For input sync, at clock edges a and b, ap_start is validated and de-asserted by the ap_ready signal, which simultaneously triggers the kernel execution. (This is somewhat similar to TVALID being validated by TREADY in the AXI stream protocol.) The XRT scheduler monitors the status of the ap_start signal and asserts ap_start when the signal is low, meaning the kernel can accept a new task. The ap_ready signal is generated by the kernel and indicates its status.
For output sync, at clock edges c and d, ap_done is confirmed and de-asserted by the ap_continue signal, marking the completion of one kernel job. When the XRT scheduler detects that ap_done has been asserted, XRT asserts ap_continue. Generally, ap_continue should be implemented as a self-clearing signal, so that it stays asserted for only one cycle.
From the waveform, we can see that before the ap_done signal is asserted, the kernel uses the ap_ready signal to tell XRT that it can accept new input data. This scheme acts as back-pressure on the input sync stage and lets the task pipeline fully utilize the hardware capability. In the example waveform above, XRT writes the ap_start bit and the ap_continue bit twice each in the AXI control slave register.
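The handshake above can be approximated with a small behavioral model. The sketch below is a toy, not cycle-accurate, and is neither XRT code nor the kernel's RTL; the class `ApCtrlChainModel` and its method names are hypothetical. It only captures the rules stated in the text: ap_ready clears ap_start, ap_continue clears ap_done, and a second job may be accepted before the first one is done.

```python
class ApCtrlChainModel:
    """Toy host-side view of the ap_ctrl_chain handshake bits."""

    def __init__(self):
        self.ap_start = 0
        self.ap_done = 0
        self.in_flight = 0  # jobs accepted by the kernel but not yet done
        self.log = []       # host register writes, in order

    def host_start(self):
        # The scheduler only sets ap_start when it reads back low.
        if self.ap_start:
            return False
        self.ap_start = 1
        self.log.append("start")
        return True

    def kernel_ready(self):
        # ap_ready accepts the pending job and de-asserts ap_start.
        if not self.ap_start:
            return False
        self.ap_start = 0
        self.in_flight += 1
        return True

    def kernel_done(self):
        # A job finishes; ap_done stays high until ap_continue arrives.
        if self.in_flight == 0 or self.ap_done:
            return False
        self.in_flight -= 1
        self.ap_done = 1
        return True

    def host_continue(self):
        # Self-clearing ap_continue acknowledges and clears ap_done.
        if not self.ap_done:
            return False
        self.ap_done = 0
        self.log.append("continue")
        return True


def run_two_jobs():
    """Replay the two-job sequence from the example waveform."""
    m = ApCtrlChainModel()
    m.host_start(); m.kernel_ready()     # edge a: first job accepted
    m.host_start(); m.kernel_ready()     # edge b: second job accepted early
    m.kernel_done(); m.host_continue()   # edge c: first job completes
    m.kernel_done(); m.host_continue()   # edge d: second job completes
    return m


if __name__ == "__main__":
    print(run_two_jobs().log)
```

In this replay the host performs two "start" writes and two "continue" writes, matching the count given for the example waveform.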
The following table lists all the control registers and kernel arguments in the AXI slave port. There is no interrupt support in this kernel.
| Name | Addr Offset | Width (bits) | Description |
|---|---|---|---|
| CTRL | 0x000 | 5 | Control signals: bit 0 - ap_start; bit 1 - ap_done; bit 2 - ap_idle; bit 3 - ap_ready; bit 4 - ap_continue |
| MODE | 0x010 | 1 | Kernel cipher mode: 0 - decryption; 1 - encryption |
| IV_W3 | 0x018 | 32 | AES-CBC mode initial vector, Word 3 |
| IV_W2 | 0x020 | 32 | AES-CBC mode initial vector, Word 2 |
| IV_W1 | 0x028 | 32 | AES-CBC mode initial vector, Word 1 |
| IV_W0 | 0x030 | 32 | AES-CBC mode initial vector, Word 0 |
| WORDS_NUM | 0x038 | 32 | Number of 128-bit words to process |
| SRC_ADDR_0 | 0x040 | 32 | Input data buffer address, LSB |
| SRC_ADDR_1 | 0x044 | 32 | Input data buffer address, MSB |
| DEST_ADDR_0 | 0x048 | 32 | Output data buffer address, LSB |
| DEST_ADDR_1 | 0x04C | 32 | Output data buffer address, MSB |
| CBC_MODE | 0x050 | 1 | Cipher processing mode: 0 - AES-ECB mode; 1 - AES-CBC mode |
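To make the register layout concrete, the following sketch encodes the table as Python constants and derives the argument writes for one job. It is only an illustration of the layout: under XRT the runtime normally sets kernel arguments for you, and the helper names `decode_ctrl` and `build_argument_writes` are hypothetical. It also assumes that IV Word 3 holds the most significant 32 bits of the 128-bit initial vector, which the table does not state.

```python
# Offsets copied from the register table above.
REG = {
    "CTRL": 0x000,
    "MODE": 0x010,
    "IV_W3": 0x018,
    "IV_W2": 0x020,
    "IV_W1": 0x028,
    "IV_W0": 0x030,
    "WORDS_NUM": 0x038,
    "SRC_ADDR_0": 0x040,
    "SRC_ADDR_1": 0x044,
    "DEST_ADDR_0": 0x048,
    "DEST_ADDR_1": 0x04C,
    "CBC_MODE": 0x050,
}

# CTRL bit names, indexed by bit position.
CTRL_BITS = ("ap_start", "ap_done", "ap_idle", "ap_ready", "ap_continue")
MASK32 = 0xFFFFFFFF


def decode_ctrl(value):
    """Split a CTRL register value into its five named bits."""
    return {name: (value >> bit) & 1 for bit, name in enumerate(CTRL_BITS)}


def build_argument_writes(iv, words_num, src_addr, dest_addr,
                          encrypt=True, cbc=True):
    """Return (offset, 32-bit value) pairs for one job's arguments.

    Assumption: IV_W3 carries bits 127..96 of the IV and IV_W0 bits 31..0.
    """
    writes = [(REG["MODE"], int(encrypt))]
    for n in (3, 2, 1, 0):
        writes.append((REG[f"IV_W{n}"], (iv >> (32 * n)) & MASK32))
    writes.append((REG["WORDS_NUM"], words_num & MASK32))
    # 64-bit buffer addresses are split into LSB and MSB halves.
    writes.append((REG["SRC_ADDR_0"], src_addr & MASK32))
    writes.append((REG["SRC_ADDR_1"], (src_addr >> 32) & MASK32))
    writes.append((REG["DEST_ADDR_0"], dest_addr & MASK32))
    writes.append((REG["DEST_ADDR_1"], (dest_addr >> 32) & MASK32))
    writes.append((REG["CBC_MODE"], int(cbc)))
    return writes


if __name__ == "__main__":
    for offset, value in build_argument_writes(
            iv=0x00112233445566778899AABBCCDDEEFF,
            words_num=64, src_addr=0x100002000, dest_addr=0x4000):
        print(f"0x{offset:03X} <- 0x{value:08X}")
```

The CTRL register is left out of the write list on purpose, since ap_start and ap_continue are driven by the scheduler as part of the handshake rather than set as arguments.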