Partitioning is the process of splitting the inference execution of a model
between the FPGA and the host. Partitioning is necessary to execute models that contain
layers unsupported by the FPGA. It is also useful for debugging and for
exploring different ways of partitioning and executing the computation graph to meet a target
objective. The following example uses a ResNet-based SSD object detection model. In
the original graph, the parts highlighted in red are replaced by the fpga_func_0
node in the partitioned graph. The partitioned code is
complete and executes on both the CPU and the FPGA.
Note: This support is currently available for Alveo™ U200/U250 with use of DPUCADX8G.
Figure 1. Original Graph
Figure 2. Partitioned Graph
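
The idea behind the two figures can be illustrated with a minimal sketch. It is not the partitioner shipped with the tools and does not use any of its API. It assumes a model that is a simple linear chain of layers (a real graph is a DAG), and the `Node` class, the `partition` function, and the `FPGA_SUPPORTED` operator set are all hypothetical names chosen for illustration. Each maximal run of supported layers is collapsed into a single `fpga_func_<i>` node, while unsupported layers stay on the host.

```python
from dataclasses import dataclass

# Hypothetical set of operator types the FPGA can run. The real list depends
# on the DPU and is not stated in this section.
FPGA_SUPPORTED = {"Conv2D", "Relu", "MaxPool", "Add"}


@dataclass
class Node:
    name: str
    op: str


def partition(nodes, supported=FPGA_SUPPORTED):
    """Collapse each maximal run of supported nodes into one fpga_func_<i> node.

    Assumes `nodes` is a linear chain in execution order.
    """
    result = []   # partitioned graph, in execution order
    run = []      # current run of consecutive FPGA-supported nodes
    count = 0     # index of the next fpga_func node

    def flush():
        nonlocal run, count
        if run:
            result.append(Node(f"fpga_func_{count}", "FPGA"))
            count += 1
            run = []

    for node in nodes:
        if node.op in supported:
            run.append(node)
        else:
            flush()              # close the FPGA segment before a host node
            result.append(node)  # unsupported layer stays on the CPU
    flush()
    return result


# Toy stand-in for a detection model: the post-processing op is unsupported.
graph = [
    Node("input", "Placeholder"),
    Node("conv1", "Conv2D"),
    Node("relu1", "Relu"),
    Node("pool1", "MaxPool"),
    Node("nms", "NonMaxSuppression"),
]
partitioned_names = [n.name for n in partition(graph)]
print(partitioned_names)
```

In this toy graph the three supported layers become one `fpga_func_0` node, and the input and post-processing nodes remain on the host, which mirrors the relationship between the original and partitioned graphs shown in the figures.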