C Simulation with Arrays - 2024.2 English - UG1399

Vitis High-Level Synthesis User Guide (UG1399)

Arrays can introduce issues during C/C++ simulation, even before the synthesis step is performed. If you specify a very large array, it might cause the C/C++ simulation to run out of memory and fail, as shown in the following example:

#include "ap_int.h"
  int i, acc = 0;
  // Use an arbitrary precision type
  ap_int<32>  la0[10000000], la1[10000000];
  for (i = 0; i < 10000000; i++) {
      acc = acc + la0[i] + la1[i];
  }

The simulation might fail by running out of memory because the array is placed on the stack, which has a limited size, rather than on the heap, which is managed by the OS and can use local disk space to grow. Several factors make this failure more likely:

  • PCs often have less available memory than large Linux servers.
  • Arbitrary precision types, as used in the example above, can make the problem worse because they require more memory to model than standard C/C++ types.
  • The more complex fixed-point arbitrary precision types found in C++ require even more memory to model, which makes running out of memory more likely still.

The standard way to improve memory resources in C/C++ code development is to increase the size of the stack using linker options. For example, the following option explicitly sets the stack size: syn.csimflags -z stack-size=10485760.
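The arithmetic behind choosing a stack size can be sketched as follows. This is a minimal illustration, not part of the tool flow: elem_t is a hypothetical stand-in that assumes each ap_int<32> element occupies 4 bytes in the C model, and the 8 MiB default stack limit is an assumption that varies by machine and shell settings.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ap_int<32>; assumes 4 bytes per element.
using elem_t = std::int32_t;

constexpr std::size_t kElems  = 10000000;  // array length used in the example
constexpr std::size_t kArrays = 2;         // la0 and la1

// Assumed default stack limit (8 MiB); the real limit depends on the machine.
constexpr std::size_t kAssumedStackLimit = 8u * 1024u * 1024u;

// Bytes that the two local arrays would occupy on the stack.
constexpr std::size_t footprint_bytes() {
    return kArrays * kElems * sizeof(elem_t);
}

// True when the arrays fit within the given stack limit (in bytes).
constexpr bool fits_on_stack(std::size_t limit_bytes) {
    return footprint_bytes() <= limit_bytes;
}

static_assert(footprint_bytes() == 80000000u, "two arrays of 10M 4-byte elements");
static_assert(!fits_on_stack(kAssumedStackLimit), "exceeds the assumed default stack");
```

Under these assumptions the two arrays need 80,000,000 bytes, so the stack size passed to the linker would have to exceed that figure, plus whatever the rest of the program uses.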

However, if the machine does not have enough available memory, increasing the stack size will not help. In this case, one solution is to use dynamic memory allocation for simulation and a fixed-size array for synthesis, as shown in the next example. The memory is then allocated on the heap, which is managed by the OS and can use local disk space to grow.

#include <stdlib.h>
#include "ap_int.h"
  int i, acc = 0;
#ifdef __SYNTHESIS__
  // Use an arbitrary precision type & array for synthesis
  ap_int<32>  la0[10000000], la1[10000000];
#else
  // Use an arbitrary precision type & dynamic memory for simulation
  ap_int<32> *la0 = (ap_int<32> *)malloc(10000000 * sizeof(ap_int<32>));
  ap_int<32> *la1 = (ap_int<32> *)malloc(10000000 * sizeof(ap_int<32>));
#endif
  for (i = 0; i < 10000000; i++) {
      acc = acc + la0[i] + la1[i];
  }

This is not an ideal solution, because the simulated code and the synthesized code are no longer the same, but it might be the only way to complete simulation. If you take this approach, be sure that the C/C++ test bench covers all aspects of accessing the array. The RTL simulation performed by cosim_design will verify that the memory accesses are correct in the synthesized code.
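The approach above can be sketched as a complete function. This is an illustration under stated assumptions rather than the guide's own example: data_t is a hypothetical stand-in for ap_int<32> so that the sketch compiles without the Vitis HLS headers, sum_arrays is a hypothetical function name, the fill values are arbitrary, and returning -1 on allocation failure is a convention chosen only for this sketch.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for ap_int<32>; swap in the real type in an HLS project.
typedef std::int32_t data_t;

const int N = 10000000;

// Hypothetical function to synthesize: fills two local arrays and sums them.
long long sum_arrays() {
#ifdef __SYNTHESIS__
    // Fixed-size arrays for synthesis.
    data_t la0[N], la1[N];
#else
    // Heap allocation for C simulation keeps the arrays off the stack.
    data_t *la0 = static_cast<data_t *>(std::malloc(N * sizeof(data_t)));
    data_t *la1 = static_cast<data_t *>(std::malloc(N * sizeof(data_t)));
    if (la0 == NULL || la1 == NULL) {
        std::free(la0);  // free(NULL) is a no-op
        std::free(la1);
        return -1;       // sketch-only convention for allocation failure
    }
#endif

    // Write every element before reading it, so both paths see defined data.
    for (int i = 0; i < N; i++) {
        la0[i] = 1;
        la1[i] = 2;
    }

    long long acc = 0;
    for (int i = 0; i < N; i++) {
        acc = acc + la0[i] + la1[i];
    }

#ifndef __SYNTHESIS__
    // Release the simulation-only buffers.
    std::free(la0);
    std::free(la1);
#endif
    return acc;
}
```

The array accesses are shared by both paths and only the storage declaration differs, which keeps the divergence between simulated and synthesized code as small as possible. A test bench would call sum_arrays() and compare the result against the expected total.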

Note: Only use the __SYNTHESIS__ macro on the code to be synthesized. Do not use this macro in the test bench, because it has no significance in the C/C++ simulation or C/C++ RTL co-simulation. Refer to Vitis-HLS-Introductory-Examples/Pipelining/Functions/hier_func for the full version of this example.