Using a functional model of an AMD Vitis™ HLS kernel during hardware emulation is an advanced use case. It compiles the kernel in functional mode and generates the XO with a SystemC wrapper around the C code. An IP, whether generated by the HLS compiler or not, can have multiple types of simulation models, such as TLM and RTL, as indicated by the allowed_sim_models property. However, the IP must also indicate which of these models is the current one, as defined by the selected_sim_model property. The method described here lets you specify which type of simulation model is selected when the HLS compiler generates the output IP or XO.
Hardware emulation mainly targets hardware kernel debug, with a detailed, cycle-accurate view of kernel activity. The functional (TLM) model speeds up emulation by compiling the kernel of interest in functional mode rather than as RTL code, and can be used as an early model when the RTL is not yet available. Compile time is shorter because the kernel does not need full C-to-RTL synthesis, and execution is faster because C code is simulated instead of RTL. You can also mix and match C and RTL kernels in hardware emulation for faster debug of RTL blocks.
The functional model feature supports modeling AXI4-Stream interfaces (axis) and AXI4 memory-mapped interfaces (m_axi), in addition to register reads and writes of the AXI4-Lite (s_axilite) interfaces. However, with this approach, the kernel is purely functional and carries no latency information, unlike cycle-accurate models.
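To make the interface terms concrete, the following is a minimal sketch of a kernel whose ports map to the memory-mapped and register interfaces named above. The kernel name `scale`, its arguments, and the bundle name `gmem` are hypothetical and not taken from this document; the pragmas follow the usual Vitis HLS `INTERFACE` syntax. A host C++ compiler ignores the pragmas (possibly with an unknown-pragma warning), so the same function can be called directly as a purely functional check with no timing information.

```cpp
#include <cassert>
#include <vector>

// Hypothetical kernel for illustration only (name, arguments, and bundle are assumptions).
extern "C" void scale(const int* in, int* out, int factor, int n) {
// Pointer arguments map to AXI4 memory-mapped (m_axi) ports.
#pragma HLS INTERFACE m_axi port=in bundle=gmem
#pragma HLS INTERFACE m_axi port=out bundle=gmem
// Scalars and block control map to the AXI4-Lite (s_axilite) register interface.
#pragma HLS INTERFACE s_axilite port=factor
#pragma HLS INTERFACE s_axilite port=n
#pragma HLS INTERFACE s_axilite port=return
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

// Host-side functional check: calls the kernel as an ordinary C++ function.
bool scale_matches_reference() {
    std::vector<int> in(64), out(64, 0);
    for (int i = 0; i < 64; ++i) in[i] = i;
    scale(in.data(), out.data(), 3, 64);
    for (int i = 0; i < 64; ++i) {
        if (out[i] != 3 * i) return false;
    }
    return true;
}
```

The check only confirms the computed values. Like the functional model itself, it says nothing about cycle counts or latency.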
The user HLS function is wrapped in a SystemC module with TLM interfaces, and an IP is created from the generated code. This produces a HW_EMU-compatible XO that can be used in IP integrator to stitch v++ link designs in hardware emulation flows, and it allows the wrapper IP to talk to other RTL and SystemC models. As a result, HLS C/C++ kernels compiled in functional mode generate TLM transactions during simulation, and you can see traffic between the memory models (for example, DDR memory) and the TLM kernels.
XO Generation with Functional Model
The v++ -c --mode hls command does not support the functional simulation model described below. To generate an IP or kernel that uses the TLM model for functional simulation, you must use the v++ -c -t=hw_emu command. During this compile step, while creating the hardware emulation (hw_emu) XO files, you can enable a functional simulation model for the PL kernel, which generates the XO with a SystemC wrapper around the C code. To do so, provide the --advanced.param compiler.emulationMode=func option during compilation, as described in --advanced Options.
The default setting is compiler.emulationMode=rtl. When building the XO, you can either specify the default explicitly with --advanced.param compiler.emulationMode=rtl, so that you can toggle between the RTL and TLM models for a specific XO by changing one value, or remove the --advanced.param option to restore the default and add it back when building for functional simulation. In either case, changing the model from RTL to TLM or back requires recompiling the XO with the v++ -c -t=hw_emu command.
The generated functional simulation XO is linked using the v++ --link command, like a regular XO.
Limitations of the Functional Model
- The functional mode is not supported on Windows OS.
- Limitations in HLS apply as is. For example, HLS does not support double pointers, so the functional model does not recognize them either.
- HLS designs that operate on multiple data iterations from the host with a single kernel ap_start (for example, ap_ctrl_chain) might not operate if the restart is triggered from the kernel code. Mailboxing works fine.
- Application Binary Interface (ABI) changes for FPGA are not available in the functional mode x86 ABI. Most optimizations that rely on the ABI need to be disabled in the functional compiler.
- Limiting DDR analysis by casting or interprocedural uses:
  - Typecasting DDR memory pointers from scalars will not work.
      kernel void vadd(size_t a_s, size_t b_s, size_t c_s) {
        // DDR pointers recovered from scalar arguments: not supported
        int* a = (int*)a_s;
        int* b = (int*)b_s;
        int* c = (int*)c_s;
        for (int i = 0; i < 64; i++) {
          c[i] = a[i] + b[i];
        }
      }
  - Caching DDR memory pointers across procedural context will not work.
      class Cache {
        int* local;  // DDR pointer stored for use in other methods: not supported
      public:
        Cache(int* a) : local(a) {}
        int read() { return *local; }
        void write(int x) { *local = x; }
      };
      kernel void vadd(int* a, int* b, int* c) {
        Cache ca(a);
        for (int i = 0; i < 64; i++) {
          c[i] = ca.read() + b[i];
        }
      }
- HLS features implemented in binary and consuming DDR memory access are not supported and require functional rewrite.
- Burst transactions are not automatically detected in the functional model.
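For contrast with the pointer-caching limitation listed above, here is a minimal sketch of one possible rewrite. It assumes the limitation concerns a DDR pointer being stored in an object and dereferenced in another function; under that assumption, the kernel reads DDR directly in its own body and keeps only the values in a local buffer. The kernel name `vadd_local_buffer`, the constant `N`, and the helper `vadd_local_buffer_ok` are hypothetical, and this is not presented as the documented workaround.

```cpp
#include <cassert>
#include <vector>

constexpr int N = 64;  // assumed fixed size, matching the loops in the examples above

// Hypothetical rewrite: the DDR pointers are dereferenced only inside the kernel body.
extern "C" void vadd_local_buffer(const int* a, const int* b, int* c) {
    assert(a != nullptr && b != nullptr && c != nullptr);
    int a_buf[N];  // local copy that holds values, not the DDR pointer
    for (int i = 0; i < N; ++i) {
        a_buf[i] = a[i];  // DDR read stays in the kernel's own scope
    }
    for (int i = 0; i < N; ++i) {
        c[i] = a_buf[i] + b[i];
    }
}

// Host-side functional check of the rewritten kernel.
bool vadd_local_buffer_ok() {
    std::vector<int> a(N), b(N), c(N, 0);
    for (int i = 0; i < N; ++i) {
        a[i] = i;
        b[i] = 10 * i;
    }
    vadd_local_buffer(a.data(), b.data(), c.data());
    for (int i = 0; i < N; ++i) {
        if (c[i] != 11 * i) return false;
    }
    return true;
}
```

The design choice is that the buffer carries data rather than an address, so no pointer into DDR crosses a function or object boundary.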
Coding guidelines for working with functional models: For kernel compute units that run multiple times and expect static values to be reset to zero in each iteration, you must initialize all static variables at the entry of the kernel function. The following example shows code that returns an error, followed by the recommended approach:
// User code that errors out
static int i = 0;
void hls_kernel_logic(...) {
...
}
// Recommended
static int i = 0;
void hls_kernel_logic(...) {
i = 0;
...
}
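The effect of the guideline can be illustrated on the host with a small, self-contained sketch. The function names `sum_without_reset` and `sum_with_reset` and the accumulator variables are hypothetical. The sketch only demonstrates ordinary C++ static-variable behavior across repeated calls, which is the assumed reason a kernel that runs several times in the same process needs the explicit reset; it does not reproduce the emulator's error.

```cpp
#include <cassert>

// Static state that is initialized once and never reset at kernel entry.
static int acc_stale = 0;

// Hypothetical kernel body: the result depends on how many times it has run.
int sum_without_reset(const int* data, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        acc_stale += data[i];
    }
    return acc_stale;
}

// Static state that is reset at the entry of the kernel function.
static int acc = 0;

// Hypothetical kernel body following the recommended pattern.
int sum_with_reset(const int* data, int n) {
    assert(n >= 0);
    acc = 0;  // explicit reset at kernel entry
    for (int i = 0; i < n; ++i) {
        acc += data[i];
    }
    return acc;
}
```

Calling each function twice on the same input shows the difference: the version with the reset returns the same sum on every call, while the version without it carries the previous total into the next run.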