An HLS function is first and foremost a function. It has a predetermined
number of inputs and outputs, and every time it is invoked it consumes its inputs and
produces the predetermined number of outputs. A function imported using
xmcImportFunction runs on the same thread as Simulink. If the imported function hangs
(for example, because it contains an infinite loop), Simulink also hangs, waiting
indefinitely for the output from the imported block.
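To illustrate the fixed-interface contract described above, the following is a minimal plain C++ sketch, not taken from the product documentation. The function name `add_scaled`, the constant `kLen`, and the arithmetic are hypothetical and chosen only for illustration. The function takes two fixed-size inputs and writes one fixed-size output on every call, and its loop bound is a compile-time constant, so it always returns.

```cpp
#include <cstdint>

// Hypothetical example (name, sizes, and arithmetic are illustrative only).
// Fixed interface: two kLen-element inputs and one kLen-element output per call.
constexpr int kLen = 4;

void add_scaled(const int16_t a[kLen], const int16_t b[kLen], int32_t out[kLen]) {
    // The loop bound is a compile-time constant, so every invocation
    // terminates and produces exactly kLen output values.
    for (int i = 0; i < kLen; ++i) {
        out[i] = static_cast<int32_t>(a[i]) + 2 * static_cast<int32_t>(b[i]);
    }
}
```

By contrast, a function whose loop waits on a condition that never becomes true would never return, and the thread that called it would wait with it.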
HLS Kernels are IPs
When you import an HLS function into a design by itself, the HLS function does not operate as an IP with streaming ports. In Vitis Model Composer, you need to enclose the HLS function in a subsystem (perhaps along with other HLS blocks) and use the Interface Spec block to designate streaming ports for the design. You can then generate an HLS IP. Unlike an imported HLS function, an HLS Kernel is a proper HLS IP that can be used in Vitis™ HLS and synthesized directly. The following code snippet shows HLS kernel code with streaming interfaces.
hls_kernel.cc
void hls_kernel_blk(
    hls::stream<ap_axis<64, 0, 0, 0> > & in_sample1,
    hls::stream<ap_axis<64, 0, 0, 0> > & in_sample2,
    hls::stream<ap_axis<64, 0, 0, 0> > & out0_itr1,
    hls::stream<ap_axis<64, 0, 0, 0> > & out1_itr1
)
{
    #pragma HLS PIPELINE II=1
    #pragma HLS INTERFACE ap_ctrl_none port=return
    #pragma HLS INTERFACE axis register both port=out1_itr1
    #pragma HLS INTERFACE axis register both port=out0_itr1
    #pragma HLS INTERFACE axis register both port=in_sample1
    #pragma HLS INTERFACE axis register both port=in_sample2
    ap_int64 in_samp0 ; // Iteration-1: 2 complex samples concatenated to 64-bit
    ap_int64 in_samp1 ; // Iteration-2: 2 complex samples concatenated to 64-bit
    ...
In this example, note the function signature and the HLS pragmas that specify the interface for each port. The function has all the constructs required for an HLS IP. You can import this code directly into Model Composer using the HLS Kernel block and then simulate it.
An HLS Kernel block can accept variable-size signals, which allows it to
connect to AI Engine blocks, and it also produces variable-size output signals.
Unlike a block imported using xmcImportFunction, the HLS Kernel block runs on a
separate thread from Simulink. As a result, a blocking read call in the HLS kernel
code does not cause Simulink to hang when the input variable-size signal is empty.
Instead, the block produces an empty variable-size output signal.
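The observable behavior described above (empty input in, empty output out) can be modeled with a small plain C++ sketch. This is a conceptual illustration only: it does not use `hls::stream`, it does not show how Model Composer schedules the kernel thread, and the names `MockStream` and `kernel_step` are hypothetical. It also assumes, purely for illustration, that one output sample is produced per pair of input samples.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical stand-in for a streaming port; this is NOT hls::stream.
using MockStream = std::queue<int64_t>;

// Models one simulation step of a streaming block: consume every sample
// pair currently available on the two inputs and emit one result per pair.
// The returned vector plays the role of a variable-size output signal.
std::vector<int64_t> kernel_step(MockStream& in1, MockStream& in2) {
    std::vector<int64_t> out;
    while (!in1.empty() && !in2.empty()) {
        out.push_back(in1.front() + in2.front());
        in1.pop();
        in2.pop();
    }
    return out;  // Empty when either input had no data for this step.
}
```

Calling `kernel_step` with two empty streams returns an empty vector rather than waiting, which mirrors the block producing an empty variable-size output. With two samples queued on each input, the returned vector holds two results.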