Writing OP Implementation in C++

Vitis AI Library User Guide (UG1354)

Document ID: UG1354

Release Date: 2023-06-29

Version: 3.5 English

In my_add_op.cpp, create a C++ class. There are no requirements for naming the source file or class.
```
// in my_add_op.cpp
class MyAddOp {
};
```
Write the constructor function as shown in the following code snippet.
```
#include <vart/op_imp.h>

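class MyAddOp {
public:
    // The constructor must be public so that it can be called from outside the class.
    MyAddOp(const xir::Op* op1, xir::Attrs* attrs) : op{op1} {
      // op1 and attrs are not used beyond initializing the member op.
    }
    const xir::Op* const op;
};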
```
Note: MyAddOp must have a public member variable named op, initialized with the first argument of the constructor (op1 in this example). DEF_XIR_OP_IMP requires this.

Write the member function, calculate, as shown in the following code snippet.

```
class MyAddOp {
  ...
  // Writes the element-wise sum of all inputs into output.
  int calculate(vart::simple_tensor_buffer_t<float> output,
                std::vector<vart::simple_tensor_buffer_t<float>> inputs) {
    for (auto i = 0u; i < output.mem_size / sizeof(float); ++i) {
      output.data[i] = 0.0f;
      for (auto input : inputs) {
        output.data[i] = output.data[i] + input.data[i];
      }
    }
    return 0;
  }
  ...
};
```
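
The accumulation loop can be tried without the Vitis AI headers. The following is a minimal sketch, assuming a hypothetical stand-in type `FakeBuffer` that models only the two members the loop uses (`data` and `mem_size`). It is not the definition of `vart::simple_tensor_buffer_t`, and `add_buffers` is an illustrative name, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for vart::simple_tensor_buffer_t<float>.
// Only the two members used by calculate() are modeled here.
struct FakeBuffer {
  float* data;           // pointer to the first element
  std::size_t mem_size;  // size of the buffer in bytes
};

// Same element-wise sum as MyAddOp::calculate, written as a free function.
int add_buffers(FakeBuffer output, const std::vector<FakeBuffer>& inputs) {
  const std::size_t count = output.mem_size / sizeof(float);
  for (const auto& input : inputs) {
    // Every input must hold at least as many elements as the output.
    assert(input.mem_size / sizeof(float) >= count);
  }
  for (std::size_t i = 0; i < count; ++i) {
    output.data[i] = 0.0f;
    for (const auto& input : inputs) {
      output.data[i] += input.data[i];
    }
  }
  return 0;
}
```

The size check is an addition for the sketch: the loop in `calculate` indexes every input up to the output's element count, so it relies on all inputs being at least as large as the output.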

Compile the source file.
```
% g++ -fPIC -std=c++17 -c -o  /tmp/my_add_op.o -Wall -Werror -I ~/.local/Ubuntu.18.04.x86_64.Debug/include/ my_add_op.cpp
```
Note: Use C++17 or later. Pass -fPIC because the object file is linked into a shared library. This example assumes that the Vitis AI Library is installed at ~/.local/Ubuntu.18.04.x86_64.Debug.

Link the object file into a shared library with the following commands.
```
% mkdir -p /tmp/lib
% g++ -Wl,--no-undefined -shared -o /tmp/lib/libvart_op_imp_add.so /tmp/my_add_op.o -L ~/.local/Ubuntu.18.04.x86_64.Debug/lib -lglog -lvitis_ai_library-runner_helper -lvart-runner -lxir
```

You can also use a Makefile to compile and link the library. An example Makefile is shown in the following code snippet. Recipe lines must begin with a tab character.
```
OUTPUT_DIR = $(HOME)/build/customer_op

all: $(OUTPUT_DIR) $(OUTPUT_DIR)/libvart_op_imp_add.so

$(OUTPUT_DIR):
	mkdir -p $@

$(OUTPUT_DIR)/my_add_op.o: my_add_op.cpp
	$(CXX) -std=c++17 -fPIC -c -o $@ -I. -I=/install/Debug/include -Wall  -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0 $<

$(OUTPUT_DIR)/libvart_op_imp_add.so:  $(OUTPUT_DIR)/my_add_op.o
	$(CXX) -Wl,--no-undefined -shared -o $@ $+ -L=/install/Debug/lib  -lglog -lvitis_ai_library-runner_helper -lvart-runner -lxir
```

To test the op implementation, first create a sample XIR graph as shown below.
```
% ipython
import xir
g = xir.Graph("simple_graph")
a = g.create_op("a", "data", {"shape": [1,2,2,4], "data_type": "FLOAT32"})
b = g.create_op("b", "data", {"shape": [1,2,2,4], "data_type": "FLOAT32"})
add = g.create_op("add_op", "add",  {"shape": [1,2,2,4], "data_type": "FLOAT32"}, {"input": [a,b]})
root = g.get_root_subgraph()
root.create_child_subgraph()
user_subgraph = root.merge_children(set([g.get_leaf_subgraph(a), g.get_leaf_subgraph(b)]))
cpu_subgraph = root.merge_children(set([g.get_leaf_subgraph(add)]))
user_subgraph.set_attr("device", "USER")
cpu_subgraph.set_attr("device", "CPU")
g.serialize("/tmp/add.xmodel")
```

Recommended: Instead of writing complex Python code, create an xmodel using the Xcompiler. For more information, refer to the Vitis AI User Guide (UG1414).

Create the sample input files and the reference output file.
```
% cd /tmp
% mkdir -p ref
% ipython
import numpy as np
a = np.arange(1, 17, dtype=np.float32)
b = np.arange(1, 17, dtype=np.float32)
a.tofile("ref/a.bin")
b.tofile("ref/b.bin")
c = a + b
c.tofile("ref/c.bin")
```

Run the test program on the op.
```
% cd /tmp
% mkdir -p /tmp/dump
% env LD_LIBRARY_PATH=$HOME/.local/Ubuntu.18.04.x86_64.Debug/lib:/tmp/lib $HOME/.local/Ubuntu.18.04.x86_64.Debug/share/vitis_ai_library/test/cpu_task/test_op_imp --graph /tmp/add.xmodel --op "add_op"
```

Note: Add /tmp/lib to LD_LIBRARY_PATH so that the CPU runner can find the shared library you built.

Important: The name of the shared library must be libvart_op_imp_<YOUR_OP_TYPE>.so. The CPU runner uses this naming scheme to find the customized xir::Op implementation.
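
In this example the op is named add_op but its type is add, so the library is libvart_op_imp_add.so. The naming rule can be written as a one-line helper. This is only an illustration of the rule stated above: `op_library_name` is a hypothetical name, and the code is not the CPU runner's actual lookup logic.

```cpp
#include <cassert>
#include <string>

// Builds the file name that the naming rule requires for a given op type.
// Illustration only: this is not the CPU runner's lookup code.
std::string op_library_name(const std::string& op_type) {
  assert(!op_type.empty());  // an empty type cannot name a library
  return "libvart_op_imp_" + op_type + ".so";
}
```

For the op type used in this section, `op_library_name("add")` yields `libvart_op_imp_add.so`, which matches the file produced by the link step.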

You can also use xdputil run_op to verify the op:

```
root@xilinx-zcu102-2021_2:~/add_op# xdputil run_op add.xmodel add_op -r ref -d dump
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1202 09:32:41.497661  1208 test_op_run.cpp:79] try to test op: add_op
I1202 09:32:41.497745  1208 test_op_run.cpp:97]  input op: a tensor: a
I1202 09:32:41.497768  1208 test_op_run.cpp:97]  input op: b tensor: b
I1202 09:32:41.497865  1208 test_op_run.cpp:55] read ref/a.bin to 0xaaab17d605d0 size=64
I1202 09:32:41.497917  1208 test_op_run.cpp:55] read ref/b.bin to 0xaaab17c549b0 size=64
I1202 09:32:41.498561  1208 test_op_run.cpp:114] graph name:simple_graphtesting op: {
    {args: input= TensorBuffer{@0xaaab17ba9b90,tensor=xir::Tensor{name = a, type = FLOAT32, shape = {1, 2, 2, 4}},location=HOST_VIRT,data=[(Virt=0xaaab17d605d0, 64)]} TensorBuffer{@0xaaab17e2a860,tensor=xir::Tensor{name = b, type = FLOAT32, shape = {1, 2, 2, 4}},location=HOST_VIRT,data=[(Virt=0xaaab17c549b0, 64)]}}
{
I1202 09:32:41.499586  1208 test_op_run.cpp:68] write output to dump/add_op.bin from 0xaaab17de7090 size=64
test pass
```

To verify that the op is implemented correctly, compare its output with the reference result.
```
% diff -u <(xxd ref/c.bin) <(xxd dump/add_op.bin)
% xxd ref/c.bin
% xxd dump/add_op.bin
```
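
Where `diff` and `xxd` are not available, the same comparison can be done in a few lines of C++. The following is a minimal sketch, assuming both files contain raw native-endian float32 values, as written by NumPy's `tofile` in the earlier step. `read_floats` and `first_mismatch` are hypothetical helper names, not part of the Vitis AI Library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a raw binary file of native-endian float32 values.
// Returns an empty vector if the file cannot be opened.
std::vector<float> read_floats(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                          std::istreambuf_iterator<char>());
  std::vector<float> values(bytes.size() / sizeof(float));
  if (!values.empty()) {
    std::memcpy(values.data(), bytes.data(), values.size() * sizeof(float));
  }
  return values;
}

// Returns the index of the first element that differs by more than
// tolerance, or -1 if the two vectors match. If the lengths differ,
// the length of the shorter vector is reported as the mismatch index.
long first_mismatch(const std::vector<float>& ref,
                    const std::vector<float>& out, float tolerance) {
  assert(tolerance >= 0.0f);
  const std::size_t n = std::min(ref.size(), out.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (std::fabs(ref[i] - out[i]) > tolerance) return static_cast<long>(i);
  }
  return ref.size() == out.size() ? -1 : static_cast<long>(n);
}
```

Calling `first_mismatch(read_floats("ref/c.bin"), read_floats("dump/add_op.bin"), 0.0f)` returns -1 when the two files agree. A zero tolerance is reasonable for this example because sums of small integers are exact in float32; ops with inexact arithmetic would need a small positive tolerance.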