AI Engine Compiler Options - 2023.1 English

AI Engine Tools and Flows User Guide (UG1076)

Document ID: UG1076
Release Date: 2023-06-23
Version: 2023.1 English
Table 1. AI Engine Options
Option Name Description
--constraints=<string> Constraints (location, bounding box, etc.) can be specified using a JSON file. This option lets you specify one or more constraint files.
--heapsize=<int> Heap size (in bytes) used by each AI Engine

The stack, heap, and sync buffer (32 bytes, includes the graph run iteration number information) are allocated up to 32768 bytes of data memory. The default heap size is set to 1024 bytes. Before changing the heap size to a different value, ensure that the sum of the stack, heap, and sync buffer sizes does not exceed 32768 bytes.

The heap is used for allocating any remaining file-scoped data that is not explicitly connected in the user graph.

--stacksize=<int> Stack size (in bytes) used by each AI Engine

The stack, heap, and sync buffer (32 bytes) are allocated up to 32768 bytes of data memory. The default stack size is set to 1024 bytes. Before changing the stack size to a different value, ensure that the sum of the stack, heap, and sync buffer sizes does not exceed 32768 bytes.

The stack is used for the standard compiler calling convention, including stack-allocated local variables and register spilling.

--pl-freq=<value> Specifies the interface frequency (in MHz) for all PLIOs. The default frequency is a quarter of the AI Engine frequency and the maximum supported frequency is half of the AI Engine frequency. The PL frequency specific to each interface can be provided in the graph.
--pl-register-threshold=<value> Specifies the frequency (in MHz) threshold for registered AI Engine-PL crossings. The default is one-eighth of the AI Engine frequency, which depends on the specific device speed grade.
Note: Values above a quarter of the AI Engine array frequency are ignored, and a quarter is used instead.
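The numeric rules in Table 1 can be checked before invoking the compiler. The following is a minimal sketch, assuming only the limits stated above (32768 bytes of data memory, a 32-byte sync buffer, 1024-byte defaults, a one-eighth default threshold, and a one-quarter cap). The function names are hypothetical helpers and are not part of the AI Engine tools, and the 1250 MHz figure in the usage note is an example value only.

```python
# Limits taken from the option descriptions in Table 1.
DATA_MEMORY_LIMIT_BYTES = 32768  # stack + heap + sync buffer must fit here
SYNC_BUFFER_BYTES = 32           # fixed-size sync buffer
DEFAULT_HEAP_BYTES = 1024
DEFAULT_STACK_BYTES = 1024


def check_memory_budget(stacksize=DEFAULT_STACK_BYTES, heapsize=DEFAULT_HEAP_BYTES):
    """Return the unused bytes, or raise ValueError if the budget is exceeded.

    Hypothetical helper: mirrors the rule that stack + heap + sync buffer
    must not exceed 32768 bytes.
    """
    total = stacksize + heapsize + SYNC_BUFFER_BYTES
    if total > DATA_MEMORY_LIMIT_BYTES:
        raise ValueError(
            f"stack + heap + sync buffer = {total} bytes "
            f"exceeds {DATA_MEMORY_LIMIT_BYTES} bytes"
        )
    return DATA_MEMORY_LIMIT_BYTES - total


def effective_register_threshold(aie_freq_mhz, requested_mhz=None):
    """Apply the one-eighth default and the one-quarter cap from the note.

    Hypothetical helper: the real default also depends on the device
    speed grade, which this sketch does not model.
    """
    if requested_mhz is None:
        return aie_freq_mhz / 8
    # Values above a quarter of the array frequency are replaced by a quarter.
    return min(requested_mhz, aie_freq_mhz / 4)
```

For example, check_memory_budget(stacksize=2048, heapsize=4096) leaves 26592 bytes, and effective_register_threshold(1250, 400) yields 312.5 because 400 MHz is above a quarter of 1250 MHz.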
Table 2. CDO Options
Option Name Description
--broadcast-enable-core Enables all AI Engines associated with a graph using broadcast. This option reserves one broadcast channel in the array for enabling cores. The default is true.
Table 3. Compiler Debugging Options
Option Name Description
--adf-api-log-level=<value> ADF API log-level. Available values are as follows:
  • 0: errors
  • 1: level-0 + warnings
  • 2: level-1 + info messages
  • 3: level-2 + debug messages

The default is 2.

--kernel-linting Performs consistency checking between graphs and kernels. The default is false.
--known-tripcount Converts an unknown trip count to a known trip count.
--quiet Suppresses the output of the AI Engine compiler.
--verbose Enables verbose output: the AI Engine compiler emits messages at various stages of compilation. These debug and tracing logs provide useful information about the compilation process.
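The --adf-api-log-level values in Table 3 are cumulative: each level adds one message class to the level below it. A minimal sketch of that rule follows. The function and constant names are hypothetical and only illustrate the table; they are not an ADF API.

```python
# Message class that each --adf-api-log-level value adds to the previous level.
LOG_LEVEL_ADDS = {
    0: "errors",
    1: "warnings",
    2: "info messages",
    3: "debug messages",
}
DEFAULT_LOG_LEVEL = 2  # default stated in Table 3


def enabled_messages(level=DEFAULT_LOG_LEVEL):
    """Return every message class emitted at the given log level."""
    if level not in LOG_LEVEL_ADDS:
        raise ValueError(f"log level must be one of {sorted(LOG_LEVEL_ADDS)}")
    # Level n includes everything from levels 0..n.
    return [LOG_LEVEL_ADDS[n] for n in range(level + 1)]
```

With the default level, enabled_messages() returns errors, warnings, and info messages, but not debug messages.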
Table 4. Design Rule Check Options
Option Name Description
--drc.disable=<string> Disables the Design Rule Check for the specified ID. A disabled check is not executed.
--drc.enable=<string> Enables the Design Rule Check for the specified ID.
--drc.severity=<string> Changes the severity of a Design Rule Check. The format is <ID>:<severity>[:context].
--drc.waive=<string> Waives the Design Rule Check for the specified ID. A waived check is still performed, but marked as waived.
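The <ID>:<severity>[:context] format accepted by --drc.severity can be split into its parts as sketched below. This is an illustration only: it assumes that the ID and severity contain no colon, and the rule ID and severity strings in the usage note are placeholders, not values taken from the tools.

```python
from typing import NamedTuple, Optional


class DrcSeverity(NamedTuple):
    rule_id: str
    severity: str
    context: Optional[str]  # None when the optional part is omitted


def parse_drc_severity(value):
    """Split '<ID>:<severity>[:context]' into its parts.

    Assumption: the ID and severity fields do not contain ':'.
    Everything after the second ':' is treated as the context.
    """
    parts = value.split(":", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected <ID>:<severity>[:context], got {value!r}")
    context = parts[2] if len(parts) == 3 else None
    return DrcSeverity(parts[0], parts[1], context)
```

For instance, parse_drc_severity("RULE-1:warning") returns a tuple whose context is None, where RULE-1 and warning are placeholder strings.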
Table 5. Execution Target Options
Option Name Description
--target=<hw|x86sim> The AI Engine compiler supports several build targets. The default is hw.
  • The hw target produces a libadf.a for use in the hardware device on a target platform and hardware emulation.
  • The x86sim target compiles the code for use in the x86 simulator as described in x86 Functional Simulator.
Table 6. File Options
Option Name Description
--include=<string> This option can be used to include additional directories in the include path for the compiler front-end processing.

Specify one or more include directories.

--output=<string> Specifies an output.json file that is produced by the front end for an input data flow graph file. The output file is passed to the back-end for mapping and code generation of the AI Engine device. This is ignored for other types of input.
--output-archive=<string> Specifies the name of the output archive that contains the compiled AI Engine artifacts. The default is libadf.a.
--platform=<string>

This is a path to a Vitis platform file that defines the hardware and software components available when doing a hardware design and its RTL co-simulation. It can be a platform specification (XPFM) or a hardware specification (XSA).

--workdir=<string>

By default, the compiler writes all outputs to a sub-directory of the current directory, called Work. Use this option to specify a different output directory.
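The target and file options in Tables 5 and 6 are typically combined on one command line. The sketch below assembles such an argument list. It assumes that the graph source file is passed as the final positional argument and that --include is repeated once per directory; neither detail is stated in the tables above. The helper name and the paths in the usage note are hypothetical.

```python
import shlex


def build_aiecompiler_command(graph_source, target="hw", include_dirs=(),
                              platform=None, workdir=None, output_archive=None):
    """Build an aiecompiler argument list from the options in Tables 5 and 6."""
    if target not in ("hw", "x86sim"):
        raise ValueError("target must be 'hw' or 'x86sim'")
    args = ["aiecompiler", f"--target={target}"]
    # Assumption: one --include option per directory.
    args += [f"--include={d}" for d in include_dirs]
    if platform is not None:
        args.append(f"--platform={platform}")
    if workdir is not None:
        args.append(f"--workdir={workdir}")
    if output_archive is not None:
        args.append(f"--output-archive={output_archive}")
    # Assumption: the graph source is the last positional argument.
    args.append(graph_source)
    return args


if __name__ == "__main__":
    cmd = build_aiecompiler_command(
        "graph.cpp", target="x86sim",
        include_dirs=["./src", "./kernels"], workdir="./Work_x86",
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Options left at None are omitted, so the compiler defaults (Work directory, libadf.a) apply.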

Table 7. Generic Options
Option Name Description
--help Lists the available AI Engine compiler options, sorted in the groups listed here.
--help-list Displays an alphabetic list of AI Engine compiler options.
--version Displays the version of the AI Engine compiler.
Table 8. Miscellaneous Options
Option Name Description
--disable-multirate This option disables multirate in ADF graphs. The default is false.
--evaluate-fifo-depth This option analyzes re-convergent data paths. Data can be sent on multiple paths that later re-converge, which can result in a deadlock. Such a deadlock can be resolved by adding FIFOs to the appropriate data paths.

After the design is compiled with this option, the design should be simulated using the aiesimulator. The aiesimulator estimates FIFO depth on relevant nets. This information can be used to set the FIFO depth on specific data paths on the graph.

--no-init This option disables initialization of window buffers in AI Engine data memory. This option enables faster loading of the binary images into the SystemC-RTL co-simulation framework. The default is false.
Tip: This does not affect the statically initialized lookup tables.
--nodot-graph By default, the AI Engine compiler produces .dot and .png files to visualize the user-specified graph and its partitioning onto the AI Engines. This option can be used to eliminate the dot graph output. The default is false.
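To make the term used by --evaluate-fifo-depth concrete, the sketch below finds re-convergence points in a small data flow graph: nodes where two incoming paths share a common upstream node. This is not the analysis performed by the AI Engine compiler or the aiesimulator; it only illustrates the graph shape the option is concerned with. The function name and the node names in the usage note are hypothetical.

```python
from collections import defaultdict


def find_reconvergence_points(edges):
    """Return nodes where two incoming paths share a common upstream node.

    edges is an iterable of (source, destination) pairs. Parallel edges
    between the same pair of nodes are treated as a single edge.
    """
    predecessors = defaultdict(set)
    for src, dst in edges:
        predecessors[dst].add(src)

    def upstream_including_self(node):
        # Iterative walk over predecessors; 'seen' also guards against cycles.
        seen = {node}
        stack = [node]
        while stack:
            for pred in predecessors.get(stack.pop(), ()):
                if pred not in seen:
                    seen.add(pred)
                    stack.append(pred)
        return seen

    points = set()
    for node, preds in predecessors.items():
        reach = [upstream_including_self(p) for p in preds]
        for i in range(len(reach)):
            for j in range(i + 1, len(reach)):
                # A shared upstream node means data forked and merges here.
                if reach[i] & reach[j]:
                    points.add(node)
    return points
```

For edges split->a, split->b, a->merge, b->merge, the only re-convergence point is merge; two unrelated sources feeding one node produce none.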
Table 9. Module Specific Options
Option Name Description
--Xchess=<string> Can be used to pass kernel specific options to the CHESS compiler that is used to compile code for each AI Engine.

The option string is specified as <kernel-function>:<optionid>=<value>. This option string is included during compilation of generated source files on the AI Engine where the specified kernel function is mapped.

--Xelfgen=<string> Can be used to pass additional command-line options to the ELF generation phase of the compiler, which is currently run as a make command to build all AI Engine ELF files.

For example, to limit the number of parallel compilations to four, write --Xelfgen="-j4".

Note: If you see bad_alloc errors in the log during compilation, or if the Vitis IDE crashes, the cause could be insufficient memory on your workstation. A possible workaround (other than increasing the available memory on your machine) is to limit the parallelism used by the compiler during the code generation phase. This can be specified in the GUI as the compiler CodeGen option -j1 or -j2, or on the command line as --Xelfgen=-j1 or --Xelfgen=-j2.
--Xmapper=<string> Can be used to pass additional command-line options to the mapper phase of the compiler. For example:
--Xmapper=DisableFloorplanning

These are options to try when the design is either failing to converge in the mapping or routing phase, or when you are trying to achieve better performance via reduction in memory bank conflict.

See the Mapper and Router Options for a list and description of options.

--Xpreproc=<string> Passes a general option to the preprocessor phase for all source code compilations (AIE/PS/PL/x86sim). For example:
--Xpreproc=-D<var>=<value>
--Xpslinker=<string> Passes a general option to the PS linker phase. For example:
--Xpslinker=-L<libpath> -l<libname>
--Xrouter=<string> Passes a general option to the router phase. For example:
--Xrouter=dmaFIFOsInFreeBankOnly
--fast-floats Enables fast implementation for linear floating point scalar operations like add, sub, mul, and compare.
--fast-nonlinearfloats Enables fast implementation for non-linear floating point scalar operations like sine/cosine, sqrt, and inv.
--fastmath Enables fast implementations of float2fix, fplt and fpge.
Note: In subsequent compilations of the AI Engine graph, only AI Engine kernels that have been modified are recompiled. Unmodified kernels are not recompiled.
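The option strings for --Xchess and --Xelfgen in Table 9 follow fixed patterns, sketched below. The helper names are hypothetical, and the kernel name and option ID in the usage note are placeholders rather than real CHESS compiler options.

```python
def xchess_option(kernel_function, option_id, value):
    """Format '--Xchess=<kernel-function>:<optionid>=<value>'."""
    for label, part in (("kernel_function", kernel_function),
                        ("option_id", option_id)):
        if not part:
            raise ValueError(f"{label} must not be empty")
    return f"--Xchess={kernel_function}:{option_id}={value}"


def xelfgen_jobs(jobs):
    """Format '--Xelfgen=-j<N>' to limit parallel ELF compilations."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return f"--Xelfgen=-j{jobs}"
```

For example, xelfgen_jobs(2) produces --Xelfgen=-j2, the workaround suggested for bad_alloc errors, and xchess_option("my_kernel", "example_option", 1) produces --Xchess=my_kernel:example_option=1 with placeholder names.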
Table 10. Event Trace Options
Option Name Description
--event-trace=<value> Event trace configuration value, where <value> is one of the following:
  • functions: Function transition view without stalls.
  • functions_partial_stalls: Function transition view with stream/lock/cascade stalls.
  • functions_all_stalls: Function transition view with all stalls (stream/lock/cascade/memory).
  • runtime: Run-time event tracing configuration.
--event-trace-port=<value> Sets the AI Engine event tracing port, where <value> is one of the following:
  • plio: Sets the AI Engine event tracing port to plio.
  • gmio: Sets the AI Engine event tracing port to gmio.

The default value is gmio; AMD recommends that you use gmio as the event-trace-port configuration. See Event Trace Build Flow for more information.

--num-trace-streams=<int> Number of trace streams. The default is 16.
--trace-plio-width=<int> PLIO width for trace streams. The default is 64. Allowed values are 32 and 64.
--graph-iterator-event Generates the user event0() whenever the graph iterator is incremented. This enables the capability to delay the start of the hardware event trace based on the graph iteration.
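The allowed values and defaults in Table 10 can be validated before they are passed to the compiler. The following is a minimal sketch with a hypothetical helper name. It assumes that --trace-plio-width is only relevant when the trace port is plio, which the table does not state explicitly.

```python
# Allowed values and their meanings, as listed in Table 10.
EVENT_TRACE_MODES = {
    "functions": "Function transition view without stalls",
    "functions_partial_stalls": "Function transition view with stream/lock/cascade stalls",
    "functions_all_stalls": "Function transition view with all stalls",
    "runtime": "Run-time event tracing configuration",
}
EVENT_TRACE_PORTS = ("plio", "gmio")
TRACE_PLIO_WIDTHS = (32, 64)


def event_trace_args(mode, port="gmio", num_streams=16, plio_width=64):
    """Return validated event trace options as a list of strings."""
    if mode not in EVENT_TRACE_MODES:
        raise ValueError(f"mode must be one of {sorted(EVENT_TRACE_MODES)}")
    if port not in EVENT_TRACE_PORTS:
        raise ValueError(f"port must be one of {EVENT_TRACE_PORTS}")
    if num_streams < 1:
        raise ValueError("num_streams must be positive")
    if plio_width not in TRACE_PLIO_WIDTHS:
        raise ValueError(f"plio_width must be one of {TRACE_PLIO_WIDTHS}")
    args = [
        f"--event-trace={mode}",
        f"--event-trace-port={port}",
        f"--num-trace-streams={num_streams}",
    ]
    # Assumption: the PLIO width only matters when tracing through PLIO.
    if port == "plio":
        args.append(f"--trace-plio-width={plio_width}")
    return args
```

Calling event_trace_args("functions") uses the recommended gmio port and the default of 16 trace streams.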
Table 11. Optimization Options
Option Name Description
--xlopt=<int> Enables a combination of kernel optimizations based on the opt level. Allowable values are 0 to 2; the default is 1.
  • xlopt=1
    • Automatic computation of heap size: Uses kernel analysis to automatically compute the heap requirements for each AI Engine, so you do not need to specify the heap size.
    • Guidance: Guidance is provided to highlight unaligned variables, global arrays that can potentially be mapper allocated, improper usage of restrict, and potential read before write conflicts.
    • Pragma insertion: Automatically infer and insert pragmas in kernel code.
  • xlopt=2
    • Automatic inline: Automatically inlines functions when it is practical and possible to do so, even if the functions are not declared as __inline or inline.
    • Loop peeling for unrolled loops: Makes the loop iteration count a multiple of the unrolling factor via peeling. Splits a loop into multiple loops based on its iteration count and profitability heuristics, and adds a flattening pragma to the split loops.
Note: Compiler optimization (xlopt > 0) reduces debug visibility.
--Xxloptstr=<string> Option string to enable/disable optimizations in xlopt level 1 and 2.
  • -annotate-pragma=false: turns off automatic insertion of loop pragmas
  • -xlinline-threshold=T: set the automatic inlining threshold to T (default T = 5000)
  • -annotate-pragma: automatic insertion of loop unrolling, pipelining, and flattening pragma (default = true)
Note: Two reserved words, aie and adf, are not valid namespace identifiers in graph programming.
Note: Function names defined in the AI Engine graph and kernel code should not be identical to function names from the C++ standard library. The aiecompiler issues an error message when such functions are used because they conflict with predefined function names.
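The reserved namespace rule above can be screened for with a simple text scan before compilation. This is a rough sketch, not a check performed by the tools: it looks only for namespace declarations named aie or adf, does not parse C++, and can be fooled by comments, string literals, or nested namespace syntax. The function name is hypothetical.

```python
import re

# Matches declarations such as 'namespace adf {'. Using-directives such as
# 'using namespace adf;' are not declarations and do not match, because
# they are followed by ';' rather than '{'.
_RESERVED_DECLARATION = re.compile(r"\bnamespace\s+(aie|adf)\s*\{")


def reserved_namespace_declarations(source):
    """Return the reserved names that the source text declares as namespaces."""
    return [match.group(1) for match in _RESERVED_DECLARATION.finditer(source)]
```

An empty list means that no declaration of a namespace named aie or adf was found in the scanned text.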