N2Cube enables priority-based DPU task scheduling through the API dpuSetTaskPriority(),
which sets a DPU task's priority to a specified value at runtime. The priority ranges
from 0 (the highest priority) to 15 (the lowest priority). If not specified, the priority
of a DPU task is 15 by default. This brings flexibility to meet the diverse requirements
of various edge
scenarios. You can assign different priorities to models that run simultaneously
so that they are scheduled onto DPU cores in priority order when they are all in the
ready state. When affinity is specified, the N2Cube priority-based scheduling also
adheres to the DPU core affinity.
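
The ordering rule described above can be illustrated with a small toy model. This is a sketch only: it does not call N2Cube, and the class ReadyQueue, the task names model_a, model_b, and model_c, and the first-in-first-out tie-break between equal priorities are all assumptions made for illustration. Only the 0 to 15 range, the default of 15, and "lower value runs first" come from the text.

```python
import heapq
import itertools

HIGHEST_PRIORITY = 0   # from the text: 0 is the highest priority
LOWEST_PRIORITY = 15   # from the text: 15 is the lowest priority and the default


class ReadyQueue:
    """Toy ready queue (hypothetical, not the N2Cube scheduler)."""

    def __init__(self):
        self._heap = []
        # Assumption: tasks with equal priority are served in arrival order.
        self._arrival = itertools.count()

    def __len__(self):
        return len(self._heap)

    def submit(self, name, priority=LOWEST_PRIORITY):
        """Mark a task as ready; an omitted priority falls back to 15."""
        if not HIGHEST_PRIORITY <= priority <= LOWEST_PRIORITY:
            raise ValueError("priority must be in the range 0..15")
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def pop_next(self):
        """Return the ready task with the smallest priority value."""
        priority, _, name = heapq.heappop(self._heap)
        return name, priority


queue = ReadyQueue()
queue.submit("model_a")              # no priority given, so it defaults to 15
queue.submit("model_b", priority=0)  # highest priority
queue.submit("model_c", priority=7)

order = []
while len(queue) > 0:
    order.append(queue.pop_next())
print(order)
```

All three tasks are ready at the same time, so the toy queue hands them out as model_b, model_c, model_a, regardless of submission order.
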
The DNNDK pose detection sample demonstrates DPU priority scheduling. This sample uses
two models: the SSD model for person detection and the pose detection model for body
key point detection. The SSD model is compiled into the DPU kernel ssd_person. The pose
detection model is compiled into two DPU kernels, pose_0 and pose_2. Therefore, each
input image passes through these three DPU kernels in the order ssd_person, pose_0,
and pose_2. In a multithreaded run, several input images may be in flight across these
three kernels at the same time. To achieve better latency, the DPU tasks for ssd_person,
pose_0, and pose_2 are assigned the priorities 3, 2, and 1 respectively, so that the DPU
task for a later kernel is scheduled ahead of tasks for earlier kernels when they are
ready to run.
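
To see why giving later kernels a higher priority can help latency, the following toy simulation runs three images through the three kernels on a single simulated core. It is a sketch under stated assumptions, not the sample's code and not the N2Cube scheduler: the function completion_times is hypothetical, every task is assumed to take one time unit, all images are assumed to arrive at once, and ties are assumed to be broken in arrival order. The kernel names and the priorities 3, 2, and 1 come from the text.

```python
import heapq

STAGES = ["ssd_person", "pose_0", "pose_2"]  # kernel order from the text


def completion_times(priorities, num_images=3):
    """Run every image through all stages on one simulated core.

    Assumptions for illustration: one core, one time unit per task,
    all images ready at time 0, FIFO order among equal priorities.
    Returns the time at which each image finishes its last stage.
    """
    ready = []   # heap of (priority, arrival order, image, stage index)
    arrival = 0
    for image in range(num_images):
        heapq.heappush(ready, (priorities[STAGES[0]], arrival, image, 0))
        arrival += 1

    clock = 0
    finished = {}
    while ready:
        _, _, image, stage = heapq.heappop(ready)
        clock += 1  # the task occupies the core for one time unit
        if stage + 1 < len(STAGES):
            # The image's next kernel becomes ready with its own priority.
            next_kernel = STAGES[stage + 1]
            heapq.heappush(ready, (priorities[next_kernel], arrival, image, stage + 1))
            arrival += 1
        else:
            finished[image] = clock
    return [finished[i] for i in range(num_images)]


# Priorities from the text: later kernels get the smaller (higher) value.
sample = completion_times({"ssd_person": 3, "pose_0": 2, "pose_2": 1})
# Baseline: every task left at the default priority of 15.
equal = completion_times({"ssd_person": 15, "pose_0": 15, "pose_2": 15})

print("per-image completion, priorities 3/2/1:", sample)
print("per-image completion, all default:     ", equal)
```

In this toy model, the 3/2/1 assignment lets each image run to completion before the next one starts, so the first result is ready after three time units. With equal priorities the stages interleave and no image finishes until all earlier stages of every image have run. Total throughput is the same in both cases; only the per-image latency differs.
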