The XIR-based compiler operates on a framework-independent XIR graph generated from deep learning frameworks. The parser removes the framework-specific attributes from the CNN models and transforms the models into XIR-based computing graphs. The compiler divides each computing graph into subgraphs, applies heterogeneous optimizations, and generates optimized machine code for each subgraph.
When the model contains operations that the DPU cannot support, some subgraphs are created and mapped to the CPU. The FPGA is flexible enough that you can create a dedicated IP to accelerate those operations and improve end-to-end performance. To enable customized accelerator IPs with the XIR-based toolchain, use the plugin pipeline to extend XIR and the compiler.
The interface class Plugin is declared in Plugin.hpp. Plugins are executed sequentially before the compiler starts to compile the graph for the DPU. First, a child subgraph is created for each operator, and the plugin picks the operators that it can accelerate. It then merges them into larger subgraphs, maps them to the customized IP, and attaches the information that the runtime (VART::Runner) needs, such as the instructions for the subgraphs.
Implementing a Plugin
- Implement Plugin::partition().
  In std::set<xir::Subgraph*> partition(xir::Graph* graph), pick the desired operations and merge them into device-level subgraphs using the following helper functions.
  - xir::Subgraph* filter_by_name(xir::Graph* graph, const std::string& name) returns the subgraph with a specific name.
  - std::set<xir::Subgraph*> filter_by_type(xir::Graph* graph, const std::string& type) returns the subgraphs with a specific type.
  - std::set<xir::Subgraph*> filter_by_template(xir::Graph* graph, xir::GraphTemplate* temp) returns the subgraphs with a specific structure.
    Figure 2. Filter by Templates
  - std::set<xir::Subgraph*> filter(xir::Graph* graph, std::function<std::set<xir::Subgraph*>(std::set<xir::Subgraph*>)> func) lets you filter the subgraphs with a customized function. This method helps you find all uncompiled subgraphs.
  To merge the child subgraphs, use the merge_subgraph() helper function. This function can only merge subgraphs at the same level. If the subgraph list cannot be merged into one subgraph, the helper function merges them as far as possible.
- Specify the name, device, and runner for the subgraphs you picked in the Plugin::partition() function.
- Implement Plugin::compile(xir::Subgraph*). This function is called for every subgraph returned by the partition() function. You can attach information to the subgraphs for use at runtime.
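The steps above can be sketched in a self-contained form. The Subgraph, Graph, and Plugin types below are simplified stand-ins written for this illustration, not the real XIR classes; only the method names partition() and compile() and the name/device/runner triple come from the text. The local filter_by_type imitates the helper described above, merging is omitted, and the attribute keys, the op type "custom-relu", the device name "MY_IP", and the runner library name are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>

namespace sketch {  // stand-in types for illustration, NOT the real xir:: API

struct Subgraph {
  std::string name;
  std::string op_type;                       // type of the single op in this child
  std::map<std::string, std::string> attrs;  // device, runner, runtime information
};

struct Graph {
  std::vector<std::unique_ptr<Subgraph>> children;  // one child subgraph per operator

  Subgraph* add_op(const std::string& name, const std::string& type) {
    children.push_back(std::make_unique<Subgraph>(Subgraph{name, type, {}}));
    return children.back().get();
  }
};

// Imitates the described filter_by_type helper: returns children whose type matches.
std::set<Subgraph*> filter_by_type(Graph* graph, const std::string& type) {
  std::set<Subgraph*> picked;
  for (auto& child : graph->children) {
    if (child->op_type == type) picked.insert(child.get());
  }
  return picked;
}

class Plugin {
 public:
  virtual ~Plugin() = default;
  virtual std::set<Subgraph*> partition(Graph* graph) = 0;
  virtual void compile(Subgraph* subgraph) = 0;
};

class MyIpPlugin : public Plugin {
 public:
  // Steps 1 and 2: pick the operators this IP accelerates, then set the
  // name, device, and runner on each picked subgraph (hypothetical values).
  std::set<Subgraph*> partition(Graph* graph) override {
    std::set<Subgraph*> picked = filter_by_type(graph, "custom-relu");
    for (Subgraph* s : picked) {
      s->name = "my_ip_" + s->name;
      s->attrs["device"] = "MY_IP";
      s->attrs["runner"] = "libmy_ip_runner.so";
    }
    return picked;
  }

  // Step 3: attach whatever the runtime needs; here a placeholder string.
  void compile(Subgraph* subgraph) override {
    assert(subgraph != nullptr);
    subgraph->attrs["instructions"] = "<generated for " + subgraph->name + ">";
  }
};

// Mimics the order described in the text: partition() first, then compile()
// on every subgraph that partition() returned.
std::set<Subgraph*> run_plugin(Plugin& plugin, Graph& graph) {
  std::set<Subgraph*> picked = plugin.partition(&graph);
  for (Subgraph* s : picked) plugin.compile(s);
  return picked;
}

}  // namespace sketch
```

In this sketch, operators that the plugin does not pick are left untouched, which mirrors how unclaimed subgraphs remain available for later DPU and CPU compilation.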
Building the Plugin
Create an extern "C" get_plugin() function that returns a new instance of your plugin, and build the implementation into a shared library.
extern "C" Plugin* get_plugin() { return new YOURPLUGIN(); }
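Because the factory is declared extern "C", its symbol name is not mangled, so a host process can look it up by the plain string get_plugin. The sketch below shows one way such a lookup might be done on a POSIX system with dlopen and dlsym. It is an illustration only, not the compiler's actual loader: Plugin is a forward-declared stand-in, and load_plugin and its error handling are hypothetical.

```cpp
#include <dlfcn.h>  // POSIX dlopen, dlsym, dlclose

#include <cassert>
#include <string>

class Plugin;  // stand-in for the interface class; only pointers are used here

using GetPluginFn = Plugin* (*)();

// Returns a new plugin instance, or nullptr if the library cannot be opened
// or does not export get_plugin. On success the handle is intentionally left
// open so that the plugin's code stays loaded while the instance is in use.
Plugin* load_plugin(const std::string& library_path) {
  assert(!library_path.empty());

  void* handle = dlopen(library_path.c_str(), RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) return nullptr;

  void* symbol = dlsym(handle, "get_plugin");  // unmangled thanks to extern "C"
  if (symbol == nullptr) {
    dlclose(handle);
    return nullptr;
  }

  auto factory = reinterpret_cast<GetPluginFn>(symbol);
  return factory();
}
```

On systems with glibc older than 2.34, linking this code may require -ldl.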
Using the Plugin
Use --options '{"plugins": "plugin0,plugin1"}' in the vai_c command line to pass your plugin libraries to the compiler. When executing your plugin, the compiler opens the library and creates an instance of your plugin by loading and calling your extern function named get_plugin. If more than one plugin is specified, they are executed sequentially in the order given on the command line. Compilation for the DPU and CPU runs after all the plugins have been executed.
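The ordering rule can be illustrated with a small sketch. Assuming the plugins value is a comma-separated list, as in the example above, the helper below splits it while preserving order, which is the order in which the plugins would then run. split_plugin_list is a hypothetical helper written for this illustration, not part of vai_c, and it does not trim whitespace around names.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits a comma-separated plugin list into names, keeping the original order.
// Empty entries (for example from a trailing comma) are skipped.
std::vector<std::string> split_plugin_list(const std::string& value) {
  std::vector<std::string> names;
  std::istringstream stream(value);
  std::string name;
  while (std::getline(stream, name, ',')) {
    if (!name.empty()) names.push_back(name);
  }
  return names;
}

// Usage check: the first name listed is the first plugin to run.
inline void check_plugin_order() {
  const std::vector<std::string> names = split_plugin_list("plugin0,plugin1");
  assert(names.size() == 2);
  assert(names.front() == "plugin0");
  assert(names.back() == "plugin1");
}
```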
Samples
Check https://github.com/Xilinx/Vitis-AI/tree/master/tools/Vitis-AI-Runtime/VART/plugin-samples for samples.