The following Vitis AI advanced low-level C++ programming APIs are briefly summarized; a short usage sketch is given at the end of this section.
Name: libn2cube.so
Description: DPU runtime library
Routines:
- dpuOpen(): Open and initialize the DPU device
- dpuClose(): Close and finalize the DPU device
- dpuLoadKernel(): Load a DPU Kernel and allocate DPU memory space for its Code/Weight/Bias segments
- dpuDestroyKernel(): Destroy a DPU Kernel and release its associated resources
- dpuCreateTask(): Instantiate a DPU Task from a DPU Kernel, allocate its private working memory buffer, and prepare its execution context
- dpuRunTask(): Launch the execution of a DPU Task
- dpuDestroyTask(): Remove a DPU Task, release its working memory buffer, and destroy the associated execution context
- dpuSetTaskPriority(): Dynamically set a DPU Task's priority to a specified value at run time. Priorities range from 0 (highest) to 15 (lowest); if not specified, a DPU Task's priority defaults to 15
- dpuGetTaskPriority(): Retrieve a DPU Task's priority
- dpuSetTaskAffinity(): Dynamically set a DPU Task's affinity over DPU cores at run time; if not specified, a DPU Task can run on all available DPU cores by default
- dpuGetTaskAffinity(): Retrieve a DPU Task's affinity over DPU cores
- dpuEnableTaskDebug(): Enable the dump facility of a DPU Task while running, for debugging purposes
- dpuEnableTaskProfile(): Enable the profiling facility of a DPU Task while running to collect its performance metrics
- dpuGetTaskProfile(): Get the execution time of a DPU Task
- dpuGetNodeProfile(): Get the execution time of a DPU Node
- dpuGetInputTensorCnt(): Get the total number of input Tensors of a DPU Task
- dpuGetInputTensor(): Get an input Tensor of a DPU Task
- dpuGetInputTensorAddress(): Get the start address of a DPU Task's input Tensor
- dpuGetInputTensorSize(): Get the size, in bytes, of a DPU Task's input Tensor
- dpuGetInputTensorScale(): Get the scale value of a DPU Task's input Tensor
- dpuGetInputTensorHeight(): Get the height dimension of a DPU Task's input Tensor
- dpuGetInputTensorWidth(): Get the width dimension of a DPU Task's input Tensor
- dpuGetInputTensorChannel(): Get the channel dimension of a DPU Task's input Tensor
- dpuGetOutputTensorCnt(): Get the total number of output Tensors of a DPU Task
- dpuGetOutputTensor(): Get an output Tensor of a DPU Task
- dpuGetOutputTensorAddress(): Get the start address of a DPU Task's output Tensor
- dpuGetOutputTensorSize(): Get the size, in bytes, of a DPU Task's output Tensor
- dpuGetOutputTensorScale(): Get the scale value of a DPU Task's output Tensor
- dpuGetOutputTensorHeight(): Get the height dimension of a DPU Task's output Tensor
- dpuGetOutputTensorWidth(): Get the width dimension of a DPU Task's output Tensor
- dpuGetOutputTensorChannel(): Get the channel dimension of a DPU Task's output Tensor
- dpuGetTensorSize(): Get the size of a DPU Tensor
- dpuGetTensorAddress(): Get the start address of a DPU Tensor
- dpuGetTensorScale(): Get the scale value of a DPU Tensor
- dpuGetTensorHeight(): Get the height dimension of a DPU Tensor
- dpuGetTensorWidth(): Get the width dimension of a DPU Tensor
- dpuGetTensorChannel(): Get the channel dimension of a DPU Tensor
- dpuSetInputTensorInCHWInt8(): Set a DPU Task's input Tensor with data stored in Caffe order (channel/height/width) in INT8 format
- dpuSetInputTensorInCHWFP32(): Set a DPU Task's input Tensor with data stored in Caffe order (channel/height/width) in FP32 format
- dpuSetInputTensorInHWCInt8(): Set a DPU Task's input Tensor with data stored in DPU order (height/width/channel) in INT8 format
- dpuSetInputTensorInHWCFP32(): Set a DPU Task's input Tensor with data stored in DPU order (height/width/channel) in FP32 format
- dpuGetOutputTensorInCHWInt8(): Get a DPU Task's output Tensor and store it in Caffe order (channel/height/width) in INT8 format
- dpuGetOutputTensorInCHWFP32(): Get a DPU Task's output Tensor and store it in Caffe order (channel/height/width) in FP32 format
- dpuGetOutputTensorInHWCInt8(): Get a DPU Task's output Tensor and store it in DPU order (height/width/channel) in INT8 format
- dpuGetOutputTensorInHWCFP32(): Get a DPU Task's output Tensor and store it in DPU order (height/width/channel) in FP32 format
- dpuRunSoftmax(): Perform the softmax calculation on the input elements and save the results to the output memory buffer
- dpuSetExceptionMode(): Set the exception-handling mode of the edge DPU runtime N2Cube
- dpuGetExceptionMode(): Get the exception-handling mode of the runtime N2Cube
- dpuGetExceptionMessage(): Get the error message for an error code (always a negative value) returned by N2Cube APIs
- dpuGetInputTotalSize(): Get the total size, in bytes, of a DPU Task's input memory buffer, which includes all the boundary input tensors
- dpuGetOutputTotalSize(): Get the total size, in bytes, of a DPU Task's output memory buffer, which includes all the boundary output tensors
- dpuGetBoundaryIOTensor(): Get a DPU Task's boundary input or output tensor by the specified tensor name. The tensor names are listed by the VAI_C compiler after model compilation
- dpuBindInputTensorBaseAddress(): Bind the specified base physical and virtual addresses of an input memory buffer to a DPU Task. It can only be used for DPU Kernels compiled by VAI_C under split IO mode
- dpuBindOutputTensorBaseAddress(): Bind the specified base physical and virtual addresses of an output memory buffer to a DPU Task. It can only be used for DPU Kernels compiled by VAI_C under split IO mode
Include File: n2cube.h
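The sketch below strings the core routines together into the typical task lifecycle: open the device, load a kernel, create and run a task, read back the raw INT8 output, convert it to probabilities with dpuRunSoftmax(), and tear everything down. The kernel name "resnet50" and the node names "conv1" and "fc1000" are placeholders for illustration only; substitute the names reported by the VAI_C compiler for your own model.

```cpp
// Minimal N2Cube usage sketch. Kernel and node names below are placeholders.
#include <n2cube.h>
#include <cstdint>
#include <vector>

int main() {
    dpuOpen();                                      // open and initialize the DPU device
    DPUKernel *kernel = dpuLoadKernel("resnet50");  // load Code/Weight/Bias segments
    DPUTask   *task   = dpuCreateTask(kernel, 0);   // 0 = normal (non-debug, non-profiling) mode

    // Query the input geometry and feed quantized INT8 data in DPU (height/width/channel) order.
    int height   = dpuGetInputTensorHeight(task, "conv1");
    int width    = dpuGetInputTensorWidth(task, "conv1");
    int channels = dpuGetInputTensorChannel(task, "conv1");
    std::vector<int8_t> image(height * width * channels);   // fill with pre-processed pixels
    dpuSetInputTensorInHWCInt8(task, "conv1", image.data(), height * width * channels);

    dpuRunTask(task);                               // blocking call; returns when the DPU finishes

    // Read the raw INT8 output and convert it to probabilities.
    int     size  = dpuGetOutputTensorSize(task, "fc1000");
    float   scale = dpuGetOutputTensorScale(task, "fc1000");
    int8_t *addr  = dpuGetOutputTensorAddress(task, "fc1000");
    std::vector<float> softmax(size);
    dpuRunSoftmax(addr, softmax.data(), size, 1, scale);     // batch size of 1

    dpuDestroyTask(task);
    dpuDestroyKernel(kernel);
    dpuClose();
    return 0;
}
```

For a kernel compiled under split IO mode, the input and output buffers would additionally be bound with dpuBindInputTensorBaseAddress() and dpuBindOutputTensorBaseAddress() before dpuRunTask() is called.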