Custom OP Workflow - 3.5 English

Vitis AI User Guide (UG1414)


Starting with version 2.5, Vitis AI supports PyTorch and TensorFlow 2 models that contain custom OPs. The basic custom OP workflow is shown below.

Figure 1. Custom Op Workflow

The following are the steps in the workflow:

  1. Define the OP as a custom OP unknown to XIR and then quantize the model.
  2. Compile the quantized model.
  3. Register and implement the custom OP.
  4. Deploy the model with graph_runner APIs.
In step 3, the custom OP can be implemented and registered in either C++ or Python. The Vitis AI Library already supports more than 50 common OPs; the source code of the common OPs is available in
Note: To accelerate a custom OP in the PL or the AI Engine, register it as a CPU OP and place the PL/AI Engine invocation code inside that CPU OP's implementation.
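As an illustration of what a CPU implementation of a custom OP might look like in Python, the sketch below implements a hypothetical `hard_sigmoid` OP in NumPy. The class and method names follow the general pattern of a per-op Python implementation, but they are assumptions for this sketch, not the exact Vitis AI registration interface; consult UG1414 for the precise API.

```python
import numpy as np

# Hypothetical CPU implementation of a custom "hard_sigmoid" OP.
# The class/method shape is illustrative only; the actual Vitis AI
# Python OP interface is described in UG1414.
class hard_sigmoid:
    def __init__(self, op=None):
        # 'op' would be the XIR op node; attributes such as alpha/beta
        # could be read from it. The defaults here are assumptions.
        self.alpha = 1.0 / 6.0
        self.beta = 0.5

    def calculate(self, output, input):
        # 'output' and 'input' stand in for the runner's tensor
        # buffers; here they are plain NumPy arrays.
        x = np.asarray(input, dtype=np.float32)
        np.copyto(output, np.clip(self.alpha * x + self.beta, 0.0, 1.0))

# Example: run the op on a small tensor.
op = hard_sigmoid()
x = np.array([-6.0, 0.0, 6.0], dtype=np.float32)
y = np.empty_like(x)
op.calculate(y, x)
print(y)  # 0.0, 0.5, 1.0
```

The same computation could be written in C++ instead; the key point is that the OP exposes an initialization hook (to read attributes from the XIR op) and a calculate hook (to fill the output buffer from the input buffer).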

For step 4, the graph_runner APIs support both C++ and Python. The graph_runner runtime is optimized for deploying models with custom OPs, including zero-copy data passing between DPU OPs and CPU OPs: adjacent layers share buffer addresses instead of copying data between them.
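To make the dispatch idea concrete, the toy sketch below mimics in plain Python how a runner might walk a compiled graph, sending DPU subgraphs to the accelerator and custom OPs to their registered CPU implementations. This is a conceptual illustration only; the registry, node format, and function names are invented here and are not the vitis_ai_library graph_runner API.

```python
import numpy as np

# Toy illustration (NOT the vitis_ai_library API): a registry of CPU OP
# implementations keyed by op type, and a runner that walks a linear
# graph, dispatching each node either to a DPU stub or to a CPU OP.
CPU_OP_REGISTRY = {
    "my_relu": lambda x: np.maximum(x, 0.0),  # hypothetical custom OP
}

def run_graph(nodes, data):
    for node in nodes:
        if node["device"] == "DPU":
            data = node["kernel"](data)                 # offloaded subgraph
        else:
            data = CPU_OP_REGISTRY[node["type"]](data)  # custom CPU OP
    return data

# A graph with one DPU subgraph followed by one custom CPU OP.
graph = [
    {"device": "DPU", "kernel": lambda x: x * 2.0},  # stand-in for a compiled subgraph
    {"device": "CPU", "type": "my_relu"},
]
out = run_graph(graph, np.array([-1.0, 2.0]))
print(out)  # 0.0, 4.0
```

In the real flow, the compiler partitions the XModel into DPU and CPU subgraphs, and graph_runner performs this dispatch internally so the application only sees a single runner with input and output tensor buffers.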

The model structures supported by zero copy are listed in the following table.

Table 1. Model Structures Supported by Zero Copy
Type | Output of OP                          | Input of OP      | Using Zero Copy
a    | Single DPU OP                         | Single CPU OP    | Yes
b    | Single CPU OP                         | Single DPU OP    | Yes
c    | Single CPU OP                         | Single CPU OP    | Yes
d    | Single DPU OP                         | Multiple CPU OPs | Yes
e    | Multiple CPU OPs and multiple DPU OPs | Single CPU OP    | Yes
Note: Model structure types a-e are shown in the following figure.
Figure 2. Model Structure Types
Note: For other model structures, whether zero copy applies depends on the situation.
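The idea behind zero copy can be illustrated with NumPy: when the consumer OP's input tensor is the same buffer as the producer OP's output tensor, the two layers share an address and no data is copied between them. This is only a conceptual analogy, not the runner's actual buffer management.

```python
import numpy as np

# Conceptual analogy for zero copy between two adjacent OPs.
producer_out = np.zeros(4, dtype=np.float32)

# The consumer OP reads the producer's output buffer directly
# (shared address) rather than receiving a copy of it.
consumer_in = producer_out  # same buffer, no memcpy

# Whatever the producer writes is immediately visible to the consumer.
producer_out[:] = [1.0, 2.0, 3.0, 4.0]
print(np.shares_memory(producer_out, consumer_in))  # True
```

With copying, each DPU-to-CPU (or CPU-to-CPU) boundary would require writing the producer's result to a new buffer; zero copy removes that transfer for the structures listed in Table 1.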

The following two example models demonstrate the custom OP workflow:

  • MNIST model based on TensorFlow 2
  • PointPillars model based on PyTorch