Core C++ Classes - 3.5 English

wego_torch::core::CompileOptions

A C++ class object that specifies the compilation options for WeGO-Torch. You need to create an instance of this class and pass it as an argument to the wego_torch::core::Compile function.

Table 1. Constructor Parameters
Type	Name	Description
wego_torch::AccuracyMode	accuracy_mode	Determines the accuracy mode. Two values are available. wego_torch::AccuracyMode::kDefaultRemoveFixNeuron: WeGO-Torch optimizes performance by eliminating redundant fixneurons during compilation. These fixneurons can exist because of quantized operators, but the on-board DPU target does not support them, so they are dispatched to the CPU for inference. Removing them from the model can improve end-to-end performance, provided that the accuracy still meets the requirements. wego_torch::AccuracyMode::kReserveFixNeuron: WeGO-Torch retains all redundant fixneurons in the model instead of removing them. Removing these fixneurons improves performance but can cause accuracy issues in certain cases, so try this value if the end-to-end accuracy falls short of the requirement after compiling the model with WeGO-Torch.
wego_torch::core::PartitionOptions	partition_options	Sets partition options. See wego_torch::core::PartitionOptions class for more detail.
std::vector<InputMeta>	inputs_meta	A vector of wego_torch::InputMeta for each input of the model.
uint32_t	thread_parallel	Parameter to optimize performance.
uint32_t	core_parallel	Parameter to optimize performance.
wego_torch::core::DebugOptions	debug_options	Sets debug options. See wego_torch::core::DebugOptions class for more detail.
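
The guidance for accuracy_mode amounts to a simple fallback: compile with the default mode first, and switch to kReserveFixNeuron only when the measured accuracy misses the target. A minimal C++ sketch of that decision follows. The enum is a local stand-in that reuses only the two enumerator names from Table 1, and NextAccuracyMode with its parameters is a hypothetical helper, not part of the WeGO-Torch API.

```cpp
// Local stand-in for wego_torch::AccuracyMode. Only the two enumerator names
// are taken from Table 1; the real type comes from the WeGO-Torch headers.
enum class AccuracyMode { kDefaultRemoveFixNeuron, kReserveFixNeuron };

// Hypothetical helper: given the end-to-end accuracy measured after compiling
// with the default mode, pick the mode to use for the next compilation.
constexpr AccuracyMode NextAccuracyMode(double measured_accuracy,
                                        double required_accuracy) {
  return measured_accuracy >= required_accuracy
             ? AccuracyMode::kDefaultRemoveFixNeuron  // accuracy is fine, keep the faster mode
             : AccuracyMode::kReserveFixNeuron;       // accuracy fell short, keep the fixneurons
}

// Compile-time usage checks with made-up accuracy figures.
static_assert(NextAccuracyMode(0.76, 0.75) == AccuracyMode::kDefaultRemoveFixNeuron, "");
static_assert(NextAccuracyMode(0.70, 0.75) == AccuracyMode::kReserveFixNeuron, "");
```

In a real application the chosen value would be passed as the accuracy_mode constructor parameter of wego_torch::core::CompileOptions before recompiling.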

wego_torch::core::PartitionOptions

Options for WeGO partition configuration.

Table 2. Constructor Parameters
Type	Name	Description
uint32_t	wego_subgraph_min_ops_number	Currently, WeGO uses a greedy method that allocates operators to the DPU whenever they are compatible. This approach can cause two issues: (1) If the DPU does not support many of the model's operators, the model might be partitioned into numerous DPU and CPU subgraphs. When each DPU subgraph contains only a few operators, running them on the DPU can hurt performance because of frequent memory transfers between the host and the device. (2) WeGO assigns a device buffer to each DPU subgraph, so a large model with many DPU subgraphs after partitioning can run into buffer overflow issues. The wego_subgraph_min_ops_number option addresses both issues by setting the minimum number of operators a DPU subgraph must contain to be executed on the DPU. A subgraph below this minimum is executed on the CPU even if the DPU supports all of its operators. Note: wego_subgraph_min_ops_number = 0 means no limit.
std::vector<std::string>	extra_accel_op_list	The DPU can run various DL operators under some constraints (for example, DPUCVDX8H_ISA1_F2W4_4PE only supports convolution with kernel 1-16 and stride 1-4). WeGO uses a DPU limitation check engine to partition operators based on their DPU compatibility. However, some operators have complex compatibility rules that might cause too much overhead. WeGO does not dispatch them to the DPU by default, but lets you specify the operators you want to accelerate in extra_accel_op_list. The following operators can be added to extra_accel_op_list for DPU execution: aten::mul, aten::mean, aten::linear, aten::unsqueeze, aten::slice. Note: If WeGO reports errors while compiling an operator listed in extra_accel_op_list, the DPU cannot accelerate that operator. If no errors are reported, the listed operators are accelerated successfully.
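
The two partition options in Table 2 can be read as two small predicates. The sketch below is not WeGO code: RunsOnDpu and IsDocumentedExtraAccelOp are hypothetical helper names, and treating the minimum as inclusive (a subgraph with exactly the minimum number of operators stays on the DPU) is an assumption based on the wording of the description.

```cpp
#include <cstdint>
#include <string_view>

// Rule described for wego_subgraph_min_ops_number: a DPU-compatible subgraph
// is executed on the DPU only if it holds at least min_ops operators.
// A value of 0 disables the limit. Inclusive comparison is an assumption.
constexpr bool RunsOnDpu(std::uint32_t subgraph_op_count, std::uint32_t min_ops) {
  return min_ops == 0 || subgraph_op_count >= min_ops;
}

// The five operators that the description lists as valid entries for
// extra_accel_op_list.
constexpr std::string_view kDocumentedExtraAccelOps[] = {
    "aten::mul", "aten::mean", "aten::linear", "aten::unsqueeze", "aten::slice"};

// Hypothetical pre-check: is this operator name one of the documented entries?
constexpr bool IsDocumentedExtraAccelOp(std::string_view op) {
  for (std::string_view known : kDocumentedExtraAccelOps) {
    if (known == op) return true;
  }
  return false;
}

// Compile-time usage checks (requires C++17 for constexpr std::string_view).
static_assert(RunsOnDpu(3, 0), "0 means no limit");
static_assert(!RunsOnDpu(3, 5), "too few operators, falls back to the CPU");
static_assert(IsDocumentedExtraAccelOp("aten::linear"), "listed operator");
```

A check like IsDocumentedExtraAccelOp only filters names before compilation. Whether a listed operator is actually accelerated is still decided by WeGO at compile time, as the note in the table states.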

wego_torch::core::DebugOptions

Options for WeGO debugging.

Table 3. Constructor Parameters
Type	Name	Description
bool	accuracy_debug	Set to true to dump (log) the input and output values of subgraphs that have accuracy issues. The default value is false.

wego_torch::InputMeta

Meta information describing the inputs of the quantized model. Due to the limitations of the Vitis AI toolchain, WeGO-Torch only supports compilation with static types and shapes. You must explicitly pass each input's data type and shape information so that WeGO-Torch can perform type and shape inference.

Table 4. Constructor Parameters
Type	Name	Description
wego_torch::DataType	type_	Data type of the current input tensor. It can be wego_torch::DataType::kBool, wego_torch::DataType::kInt32, or wego_torch::DataType::kFloat32.
wego_torch::ShapeType	input_shape_	Input shape of the current input tensor.
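
Because compilation needs static shapes, it can help to validate a shape before building the input metadata. The following is a minimal sketch under stated assumptions: the layout of wego_torch::ShapeType is not given here, so the shape is modeled as a std::array of 64-bit dimensions, and a non-positive dimension is assumed to mark a dynamic or invalid size. IsStaticShape and NumElements are hypothetical helpers, not WeGO-Torch functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumption: a shape is a fixed list of dimensions, and any dimension <= 0
// (for example -1 for "unknown") means the shape is not static.
template <std::size_t N>
constexpr bool IsStaticShape(const std::array<std::int64_t, N>& shape) {
  for (std::int64_t dim : shape) {
    if (dim <= 0) return false;
  }
  return true;
}

// Number of elements in a static shape: the product of all dimensions.
template <std::size_t N>
constexpr std::int64_t NumElements(const std::array<std::int64_t, N>& shape) {
  std::int64_t count = 1;
  for (std::int64_t dim : shape) count *= dim;
  return count;
}

// Compile-time usage checks with an illustrative NCHW image shape.
static_assert(IsStaticShape(std::array<std::int64_t, 4>{1, 3, 224, 224}), "");
static_assert(!IsStaticShape(std::array<std::int64_t, 4>{-1, 3, 224, 224}), "");
static_assert(NumElements(std::array<std::int64_t, 4>{1, 3, 224, 224}) == 150528, "");
```

A shape that passes such a check would then be supplied, together with the data type, as the input_shape_ and type_ constructor parameters listed in Table 4.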

wego_torch::TargetInfo

A C++ class object that serves as a wrapper for DPU target information, providing access to the batch, name, fingerprint, and fingerprint-driven information of the DPU target on-board.

Table 5. Constructor Parameters
Type	Name	Description
std::string	name	Name of the DPU target.
uint64_t	fingerprint	Fingerprint of the DPU target on-board.
bool	is_fingerprint_driven	Indicates whether DPU subgraphs are compiled based on fingerprint or target name.
uint32_t	batch	Batch size supported by the DPU target on-board.