Core C++ Classes - 3.5 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2023-09-28
Version: 3.5 English

wego_torch::core::CompileOptions

A C++ class that specifies the compilation options for WeGO-Torch. Create an instance of this class and pass it as an argument to the wego_torch::core::Compile function.
Table 1. Constructor Parameters
Type Name Description
wego_torch::AccuracyMode accuracy_mode Selects the accuracy mode. Two values are available:
  • wego_torch::AccuracyMode::kDefaultRemoveFixNeuron: WeGO-Torch removes redundant fixneurons during compilation to improve performance. Quantized operators can introduce these fixneurons, but the on-board DPU target does not support them, so they are dispatched to the CPU for inference. Removing them from the model improves end-to-end performance, provided that the resulting accuracy meets the requirements.
  • wego_torch::AccuracyMode::kReserveFixNeuron: WeGO-Torch retains all redundant fixneurons in the model instead of removing them. Removing fixneurons improves performance, but it can cause accuracy issues in certain cases. Try this value if the end-to-end accuracy falls short of the requirement after compiling the model with WeGO-Torch.
wego_torch::core::PartitionOptions partition_options Sets partition options. See wego_torch::core::PartitionOptions class for more detail.
std::vector<InputMeta> inputs_meta A vector of wego_torch::InputMeta for each input of the model.
uint32_t thread_parallel Parameter to optimize performance.
uint32_t core_parallel Parameter to optimize performance.
wego_torch::core::DebugOptions debug_options Sets debug options. See wego_torch::core::DebugOptions class for more detail.
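
The accuracy_mode guidance in Table 1 amounts to a simple fallback: try the faster mode first and switch to kReserveFixNeuron only when accuracy falls short. The following is a minimal sketch of that decision, not the WeGO-Torch API. The enum is a local stand-in that reuses the documented enumerator names, and EvaluateFn and ChooseAccuracyMode are hypothetical names; the callback is assumed to compile the model with the given mode, run your evaluation set, and return the end-to-end accuracy.

```cpp
#include <cassert>
#include <functional>

// Local stand-in for wego_torch::AccuracyMode (enumerator names from Table 1).
enum class AccuracyMode { kDefaultRemoveFixNeuron, kReserveFixNeuron };

// Hypothetical callback: compiles the model with the given mode, evaluates it,
// and returns the measured end-to-end accuracy.
using EvaluateFn = std::function<double(AccuracyMode)>;

// Prefers the faster mode and falls back to kReserveFixNeuron only when the
// faster mode misses the accuracy requirement.
AccuracyMode ChooseAccuracyMode(const EvaluateFn& evaluate,
                                double required_accuracy) {
  assert(evaluate != nullptr);
  if (evaluate(AccuracyMode::kDefaultRemoveFixNeuron) >= required_accuracy) {
    return AccuracyMode::kDefaultRemoveFixNeuron;
  }
  return AccuracyMode::kReserveFixNeuron;
}
```

In a real application, the callback would wrap a call to wego_torch::core::Compile with a CompileOptions instance built for the given mode, followed by your own accuracy measurement.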

wego_torch::core::PartitionOptions

Options for WeGO partition configuration.
Table 2. Constructor Parameters
Type Name Description
uint32_t wego_subgraph_min_ops_number

Currently, WeGO uses a greedy method to allocate operators to the DPU as long as they are compatible. However, this approach can give rise to the following issues:

  • If the DPU does not support many operators, the model might end up being partitioned into numerous DPU subgraphs and CPU subgraphs. When each DPU subgraph contains only a small number of operators, executing these subgraphs on the DPU might lead to performance problems due to frequent memory transfers between the host and the device.
  • WeGO assigns a device buffer for each DPU subgraph. In cases where the model is large and there are multiple DPU subgraphs after partitioning, there is a possibility of buffer overflow issues.
To address these issues, the wego_subgraph_min_ops_number option sets the minimum number of operators that a DPU subgraph must contain to be executed on the DPU. A subgraph with fewer operators is executed on the CPU, even if the DPU supports all of its operators.
Note: wego_subgraph_min_ops_number = 0 means no limit.
std::vector<std::string> extra_accel_op_list The DPU can run various DL operators, subject to some constraints (for example, DPUCVDX8H_ISA1_F2W4_4PE only supports convolution with kernel 1-16 and stride 1-4). WeGO uses a DPU limitation check engine to partition operators based on their DPU compatibility. However, some operators have complex compatibility rules that might cause too much overhead. WeGO does not dispatch these operators to the DPU by default, but lets you specify the ones you want to accelerate in extra_accel_op_list. The following operators can be added to extra_accel_op_list for DPU execution:
  • aten::mul
  • aten::mean
  • aten::linear
  • aten::unsqueeze
  • aten::slice
Note: If WeGO reports errors while compiling an operator listed in extra_accel_op_list, the DPU cannot accelerate that operator. If no errors are reported, the listed operators can be accelerated successfully.
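
To make the two options in Table 2 concrete, the sketch below models their documented behavior with local stand-in types. It does not use the WeGO headers: PartitionOptionsSketch, Device, PlaceSubgraph, and UnlistedExtraAccelOps are hypothetical names. Treating the minimum as inclusive (a subgraph with exactly wego_subgraph_min_ops_number operators stays on the DPU) is an assumption about how the threshold is applied.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Local stand-in mirroring the two fields in Table 2; not the real WeGO type.
struct PartitionOptionsSketch {
  std::uint32_t wego_subgraph_min_ops_number;
  std::vector<std::string> extra_accel_op_list;
};

enum class Device { kCpu, kDpu };

// Decides where a DPU-compatible subgraph with op_count operators runs.
// Assumption: the minimum is inclusive. A value of 0 means no limit.
Device PlaceSubgraph(std::uint32_t op_count,
                     const PartitionOptionsSketch& options) {
  assert(op_count > 0);  // a subgraph contains at least one operator
  const std::uint32_t min_ops = options.wego_subgraph_min_ops_number;
  if (min_ops == 0) return Device::kDpu;
  return op_count >= min_ops ? Device::kDpu : Device::kCpu;
}

// The operators the guide lists as eligible for extra_accel_op_list.
const std::array<std::string, 5> kDocumentedExtraAccelOps = {
    "aten::mul", "aten::mean", "aten::linear", "aten::unsqueeze",
    "aten::slice"};

// Returns the requested operators that are not in the documented list, so
// they can be reviewed before compilation.
std::vector<std::string> UnlistedExtraAccelOps(
    const PartitionOptionsSketch& options) {
  std::vector<std::string> unlisted;
  for (const std::string& op : options.extra_accel_op_list) {
    const bool documented =
        std::find(kDocumentedExtraAccelOps.begin(),
                  kDocumentedExtraAccelOps.end(),
                  op) != kDocumentedExtraAccelOps.end();
    if (!documented) unlisted.push_back(op);
  }
  return unlisted;
}
```

With wego_subgraph_min_ops_number set to 5, a two-operator subgraph is placed on the CPU and a five-operator subgraph stays on the DPU. The pre-check on extra_accel_op_list is only a convenience: per the note above, the compiler's own error reporting determines whether a listed operator can actually be accelerated.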

wego_torch::core::DebugOptions

Options for WeGO debugging.
Table 3. Constructor Parameters
Type Name Description
bool accuracy_debug Set to true to dump the input and output values of subgraphs with accuracy issues. The default value is false.

wego_torch::InputMeta

Meta information describing the inputs of the quantized model. Due to the limitations of the Vitis AI toolchain, WeGO-Torch only supports compilation with static types and shapes. You must explicitly pass the data type and shape of each input so that WeGO-Torch can perform type and shape inference.
Table 4. Constructor Parameters
Type Name Description
wego_torch::DataType type_ Data type of the current input tensor. It can be wego_torch::DataType::kBool, wego_torch::DataType::kInt32, or wego_torch::DataType::kFloat32.
wego_torch::ShapeType input_shape_ Input shape of the current input tensor.
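
Because compilation requires a static type and shape for every input, it can help to validate the metadata before passing it on. The sketch below uses local stand-ins rather than the WeGO headers. It assumes that ShapeType behaves like a list of dimension sizes and that a static shape means every dimension is a positive integer; InputMetaSketch, IsStaticShape, ElementCount, and IsCompilable are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-ins for the documented types (assumptions, not the real header).
enum class DataType { kBool, kInt32, kFloat32 };
using ShapeType = std::vector<std::int64_t>;

struct InputMetaSketch {
  DataType type_;
  ShapeType input_shape_;
};

// A shape is treated as static when every dimension is a positive integer.
bool IsStaticShape(const ShapeType& shape) {
  return std::all_of(shape.begin(), shape.end(),
                     [](std::int64_t dim) { return dim > 0; });
}

// Number of elements described by a static shape (1 for an empty shape).
std::int64_t ElementCount(const ShapeType& shape) {
  assert(IsStaticShape(shape));
  std::int64_t count = 1;
  for (std::int64_t dim : shape) count *= dim;
  return count;
}

// True when the input carries a fully specified shape.
bool IsCompilable(const InputMetaSketch& meta) {
  return IsStaticShape(meta.input_shape_);
}
```

For example, a kFloat32 input with shape {1, 3, 224, 224} passes the check, while a shape that uses -1 as a placeholder for an unknown batch dimension does not.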

wego_torch::TargetInfo

A C++ class that wraps DPU target information, providing access to the batch, name, fingerprint, and fingerprint-driven information of the on-board DPU target.
Table 5. Constructor Parameters
Type Name Description
std::string name Name of the DPU target
uint64_t fingerprint Fingerprint of the DPU target on-board
bool is_fingerprint_driven Indicates whether DPU subgraphs are compiled based on fingerprint or target name.
uint32_t batch Batch size supported by the DPU target on-board
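
As an illustration of how the fields in Table 5 relate to each other, the sketch below mirrors them in a local stand-in struct. It is not the WeGO-Torch class: TargetInfoSketch, CompilationKey, and RunsNeeded are hypothetical names, and the fingerprint value in the usage note is made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Local stand-in mirroring the fields in Table 5; not the real WeGO type.
struct TargetInfoSketch {
  std::string name;
  std::uint64_t fingerprint;
  bool is_fingerprint_driven;
  std::uint32_t batch;
};

// Returns the identifier that drives compilation: the fingerprint (formatted
// as hexadecimal) when is_fingerprint_driven is true, otherwise the name.
std::string CompilationKey(const TargetInfoSketch& target) {
  if (!target.is_fingerprint_driven) return target.name;
  std::ostringstream key;
  key << "0x" << std::hex << target.fingerprint;
  return key.str();
}

// Number of batched runs needed to process sample_count inputs, rounding up
// so that a final, partially filled batch is counted.
std::uint32_t RunsNeeded(std::uint32_t sample_count,
                         const TargetInfoSketch& target) {
  assert(target.batch > 0);
  return (sample_count + target.batch - 1) / target.batch;
}
```

For a target named DPUCVDX8H_ISA1_F2W4_4PE with a batch of 4, processing 10 inputs takes 3 runs. With is_fingerprint_driven set to true and an illustrative fingerprint of 0x1234, CompilationKey returns "0x1234" instead of the name.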