Custom OP Registration - 3.5 English

Vitis AI User Guide (UG1414)

Before custom op registration, you can use the latest Netron program to inspect the compiled model. In the following graph, the PPScatter op is assigned to the CPU, so you must implement and register the PPScatter op.
Figure 1. PPScatter Op in CPU Subgraph


  1. Use Netron to open the compiled model and find the custom op in the CPU subgraph with op information.
    Figure 2. The Inputs and Outputs of PPScatter Op

    As shown in the previous model structure image, the operation type (op type) is PPScatterV2. PPScatterV2 is the name of the custom op to create.

    To obtain detailed information about the custom op, use the xdputil tool. Run the following command to inspect the op:
    xdputil xmodel pointpillars_custom_op.xmodel --op VoxelNet__VoxelNet_input_4
  2. Write your own implementation of this op.

    Custom op registration supports both C++ and Python. The following steps show how to implement the op in C++. For the Python implementation of the op, refer to Vitis-AI/examples/custom_operator/pytorch_example/op_registration/python/.

    Note: Vitis-AI/examples/custom_operator/op_add contains a file that comprehensively describes the procedure for implementing and registering a custom op; refer to it for guidance and instructions.
    1. Create the my_PPScatter_op.cpp source file and place it in the new folder, op_PPScatter.

      You can also copy an existing op implementation and rename it, then rename my_tanh_op.cpp to my_PPScatter_op.cpp:

      cp -r Vitis-AI/src/vai_library/cpu_task/examples/op_tanh/ op_PPScatter
    2. Create the Makefile.
      OUTPUT_DIR = $(HOME)/build/custom_op

      all: $(OUTPUT_DIR) $(OUTPUT_DIR)/libvart_op_imp_PPScatterV2.so

      $(OUTPUT_DIR):
      	mkdir -p $@

      $(OUTPUT_DIR)/my_PPScatter_op.o: my_PPScatter_op.cpp
      	$(CXX) -std=c++17 -fPIC -c -o $@ -I. -I=/install/Debug/include -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0 $<

      $(OUTPUT_DIR)/libvart_op_imp_PPScatterV2.so: $(OUTPUT_DIR)/my_PPScatter_op.o
      	$(CXX) -Wl,--no-undefined -shared -o $@ $+ -L=/install/Debug/lib -lglog -lvitis_ai_library-runner_helper -lvart-runner -lxir
    3. Write the implementation of the op.

      In the my_PPScatter_op.cpp file, use the constructor function to initialize any required variables. In this example, no variables need to be initialized.

      Implement your custom logic in the 'calculate()' function. The primary objective of this logic is to retrieve input data from the inputs variable, perform the necessary calculations, and write the output data to the output variable.

      The following is the code for my_PPScatter_op.cpp:
      #include <vart/op_imp.h>

      class MyPPScatterOp {
       public:
        MyPPScatterOp(const xir::Op* op1, xir::Attrs* attrs) : op{op1} {
          // op and attrs are not in use.
        }
        int calculate(vart::simple_tensor_buffer_t<float> output,
                      std::vector<vart::simple_tensor_buffer_t<float>> inputs) {
          CHECK_EQ(inputs.size(), 2);
          auto input_data_shape = inputs[0].tensor->get_shape();
          auto input_coord_shape = inputs[1].tensor->get_shape();
          auto output_shape = output.tensor->get_shape();
          CHECK_EQ(input_data_shape.size(), 4);  // 1 12000 1 64 --> 1 64 12000 1
          CHECK_EQ(input_coord_shape.size(), 3); // 1 12000 4
          CHECK_EQ(output_shape.size(), 4);      // 1 496 432 64 ---> 1 64 496 432
          auto coord_numbers = input_coord_shape[1];
          auto coord_channel = input_coord_shape[2];
          CHECK_EQ(coord_numbers, input_data_shape[2]);
          auto batch = output_shape[0];
          auto height = output_shape[2];
          auto width = output_shape[3];
          auto channel = output_shape[1];
          CHECK_EQ(input_data_shape[0], batch);
          CHECK_EQ(channel, input_data_shape[1]);
          auto output_idx = 0;
          auto input_idx = 0;
          auto x_idx = 0;
          // Zero the output canvas before scattering.
          memset(output.data, 0,
                 output_shape[0] * output_shape[1] * output_shape[2] *
                     output_shape[3] * sizeof(float));
          for (auto n = 0; n < coord_numbers; n++) {
            auto x = (int)inputs[1].data[x_idx + 3];
            auto y = (int)inputs[1].data[x_idx + 2];
            if (x < 0) break;  // stop copying data when coord x == -1
            for (int i = 0; i < channel; i++) {
              output_idx = i * height * width + y * width + x;
              input_idx = n + i * coord_numbers;
              output.data[output_idx] = inputs[0].data[input_idx];
            }
            x_idx += coord_channel;
          }
          return 0;
        }

       public:
        const xir::Op* const op;
      };

      // Register the implementation so the runner can resolve it by op type.
      DEF_XIR_OP_IMP(MyPPScatterOp)
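      The heart of this op is the index arithmetic that scatters each pillar's channel-major feature vector into an NCHW canvas at its (y, x) cell. The following standalone sketch reproduces that mapping on plain arrays with made-up miniature tensor sizes (independent of the vart headers, for illustration only):

      ```cpp
      #include <cassert>
      #include <vector>

      // Miniature version of the PPScatter mapping: scatter per-pillar feature
      // vectors (channel-major [C, N] layout) into a zeroed [C, H, W] canvas at
      // the (y, x) cell given by each pillar's coordinates. Coordinates are
      // stored as N groups of 4 values, with y at offset 2 and x at offset 3,
      // matching the layout checked in calculate() above.
      std::vector<float> pp_scatter(const std::vector<float>& features,  // C * N
                                    const std::vector<int>& coords,     // N * 4
                                    int channels, int num_pillars,
                                    int height, int width) {
        std::vector<float> canvas(channels * height * width, 0.0f);
        int x_idx = 0;
        for (int n = 0; n < num_pillars; ++n) {
          int x = coords[x_idx + 3];
          int y = coords[x_idx + 2];
          if (x < 0) break;  // coord x == -1 marks the end of valid pillars
          for (int i = 0; i < channels; ++i) {
            // Same index arithmetic as the op: flat NCHW offset i*H*W + y*W + x.
            canvas[i * height * width + y * width + x] =
                features[n + i * num_pillars];
          }
          x_idx += 4;  // coord_channel == 4
        }
        return canvas;
      }

      int main() {
        // 2 channels, 3 pillars (the third marked invalid), 2x3 canvas.
        auto canvas = pp_scatter({1.f, 2.f, 3.f, 4.f, 5.f, 6.f},
                                 {0, 0, 1, 2,  0, 0, 0, 0,  0, 0, 0, -1},
                                 2, 3, 2, 3);
        assert(canvas[1 * 3 + 2] == 1.f);      // pillar 0, channel 0 at (y=1, x=2)
        assert(canvas[6 + 1 * 3 + 2] == 4.f);  // pillar 0, channel 1
        assert(canvas[0] == 2.f);              // pillar 1, channel 0 at (y=0, x=0)
        return 0;
      }
      ```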
    4. Build the library. The target directory is $(HOME)/build/custom_op/. You can modify the path in the Makefile.

      When executing the make command using your provided Makefile, the custom-defined op library is generated in the following directory: $(HOME)/build/custom_op/.

      The library is named after the op type, following the libvart_op_imp_<op type>.so convention; for this example the file is libvart_op_imp_PPScatterV2.so.
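      The op type to library name mapping is a plain string convention (taken here from the cpu_task examples; treat the exact rule as an assumption). As a sketch:

      ```cpp
      #include <cassert>
      #include <string>

      // The cpu_task runner resolves the implementation of a CPU op by name
      // convention: libvart_op_imp_<op type>.so. This helper only illustrates
      // that convention; it is not part of the Vitis AI API.
      std::string op_imp_library_name(const std::string& op_type) {
        return "libvart_op_imp_" + op_type + ".so";
      }

      int main() {
        assert(op_imp_library_name("PPScatterV2") ==
               "libvart_op_imp_PPScatterV2.so");
        return 0;
      }
      ```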

    5. Copy the generated library to /usr/lib on the target.
  3. Verify the Op on the target.
    1. Use the run_op command in xdputil to test the op:
      xdputil run_op pointpillars_op.xmodel VoxelNet__VoxelNet_input_4 -r ref -d dump

      Before running the command, prepare the reference inputs of the op in the ref folder. After you run the command, the VoxelNet__VoxelNet_input_4.bin file is generated in the dump folder.

    2. Compare the output with the golden file:
       xdputil comp_float ref/VoxelNet__VoxelNet_input_4.bin dump/VoxelNet__VoxelNet_input_4.bin
      If the op implementation is successful, the comparison reports that the two files match.
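      Conceptually, comp_float performs an element-wise comparison of two float buffers against a tolerance. The following standalone sketch shows the idea; it is an illustration, not the xdputil implementation, and the tolerance value is a made-up example:

      ```cpp
      #include <cassert>
      #include <cmath>
      #include <vector>

      // Compare two float buffers element-wise against an absolute tolerance.
      // Buffers of different sizes never match. The 1e-5 default is illustrative.
      bool buffers_match(const std::vector<float>& ref,
                         const std::vector<float>& dump, float tol = 1e-5f) {
        if (ref.size() != dump.size()) return false;
        for (size_t i = 0; i < ref.size(); ++i) {
          if (std::fabs(ref[i] - dump[i]) > tol) return false;
        }
        return true;
      }

      int main() {
        assert(buffers_match({1.0f, 2.0f}, {1.0f, 2.0f}));
        assert(!buffers_match({1.0f, 2.0f}, {1.0f, 2.1f}));
        return 0;
      }
      ```

      In practice, you would fill the two vectors from the ref and dump .bin files before comparing.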