Xilinx has enhanced Caffe package to automatically partition a Caffe graph. This function separates the FPGA executable layers in the network and generates a new prototxt, which is used for the inference. The subgraph cutter creates a custom python layer to be accelerated on the FPGA. The following code snippet explains the code:
from vai.dpuv1.rt.scripts.framework.caffe.xfdnn_subgraph \
import CaffeCutter as xfdnnCutter
def Cut(prototxt):
cutter = xfdnnCutter(
inproto="quantize_results/deploy.prototxt",
trainproto=prototxt,
outproto="xfdnn_auto_cut_deploy.prototxt",
outtrainproto="xfdnn_auto_cut_train_val.prototxt",
cutAfter="data",
xclbin=XCLBIN,
netcfg="work/compiler.json",
quantizecfg="work/quantizer.json",
weights="work/deploy.caffemodel_data.h5"
)
cutter.cut()
#cutting and generating a partitioned graph auto_cut_deploy.prototxt
Cut(prototxt)