The L3 API (General Query Engine, GQE) is intended for users who want to link a shared library and call the API to accelerate part of an execution plan on FPGA cards.
The major features of the L3 API are:
- Generalized query execution. The L3 API predefines operator combinations such as “scan + filter + aggregation + write”, “scan + filter + bloom filter + write”, and “scan + filter + hash-join + write”. Filter conditions support comparisons among four input columns and two constants. Aggregation supports max/min/sum/count/mean/variance/norm_L1/norm_L2. In this way, the L3 API can support generalized query operators.
- Automatic card management. As soon as the program creates an instance of GQE, it scans the machine and finds all qualified AMD FPGA cards by their shell name. It loads the cards with the xclbins and creates the context, command queue, kernel, host buffers, device buffers, and job queue for each card. These stay alive until you call the release() functions. All initialization is therefore done automatically, which avoids the overhead of repeating the setup each time you call the GQE API.
- Lightweight memory management. The input and output of GQE are data structures called “TableSection”. A TableSection only contains pointers to user-allocated memory, so GQE does not perform any memory allocation for input or output. This makes integration easier because it does not affect the original DBMS’s memory pool management.
- Asynchronous API call. The input is cut into multiple sections of rows. The GQE API requires you to provide a std::future type argument for each input section, to indicate the readiness of that input. It also requires a std::promise type argument for each output section, to notify the caller thread that the result is ready. The GQE API pushes all input arguments into an internal job queue and returns immediately. Actual processing of a section does not begin until the corresponding std::future argument for its input is ready. This separates input preparation from the actual GQE processing: GQE can start processing the sections that are ready even if not all input sections are ready. This helps pipeline “preparing” and “processing” and improves system performance.
- Column-oriented. A columnar DBMS benefits from accessing only a subset of columns and from having more options for data compression.
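
The memory-management and asynchronous-call points above can be illustrated with a minimal C++ sketch. It does not use the real GQE API: `SectionView`, `Job`, `filter_sum`, `worker`, and `run_sections_demo` are hypothetical names invented for this example, and a CPU thread stands in for the FPGA kernel. The sketch only assumes what the list describes: sections hold pointers to caller-owned memory, each input section carries a std::future that signals readiness, and each output carries a std::promise that the engine fulfills.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical non-owning view of one section of a column.
// This is NOT the real GQE TableSection; it only mirrors the idea
// that the engine holds pointers and never allocates or frees.
struct SectionView {
    const std::int32_t* data;  // user-allocated memory
    std::size_t rows;
};

// One queued job: an input section, its readiness signal, and the
// promise used to hand the result back to the caller thread.
struct Job {
    SectionView input;
    std::future<void> input_ready;
    std::promise<std::int64_t> result;
};

// CPU stand-in for "scan + filter + aggregation": sum of values > threshold.
std::int64_t filter_sum(const SectionView& s, std::int32_t threshold) {
    assert(s.data != nullptr);
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < s.rows; ++i) {
        if (s.data[i] > threshold) sum += s.data[i];
    }
    return sum;
}

// Worker: handles jobs in queue order and blocks only on the future of
// the section it is about to process, not on the whole input.
void worker(std::vector<Job>& jobs, std::int32_t threshold) {
    for (Job& j : jobs) {
        j.input_ready.wait();
        j.result.set_value(filter_sum(j.input, threshold));
    }
}

// Queues three sections, then fills them one by one while the worker
// is already running. Returns the per-section results.
std::vector<std::int64_t> run_sections_demo() {
    constexpr std::size_t kSections = 3;
    constexpr std::size_t kRows = 4;

    // Caller-owned buffers; the worker only ever sees pointers into them.
    std::vector<std::vector<std::int32_t>> buffers(
        kSections, std::vector<std::int32_t>(kRows, 0));
    std::vector<std::promise<void>> ready(kSections);
    std::vector<Job> jobs;
    std::vector<std::future<std::int64_t>> results;

    for (std::size_t i = 0; i < kSections; ++i) {
        Job j{SectionView{buffers[i].data(), kRows},
              ready[i].get_future(),
              std::promise<std::int64_t>{}};
        results.push_back(j.result.get_future());
        jobs.push_back(std::move(j));
    }

    // "Submit" returns immediately; processing happens on another thread.
    std::thread t(worker, std::ref(jobs), 2);

    // Prepare the input section by section and signal readiness each time.
    for (std::size_t i = 0; i < kSections; ++i) {
        for (std::size_t r = 0; r < kRows; ++r) {
            buffers[i][r] = static_cast<std::int32_t>(i * kRows + r);
        }
        ready[i].set_value();
    }

    std::vector<std::int64_t> out;
    for (auto& f : results) out.push_back(f.get());
    t.join();
    return out;
}
```

Calling `run_sections_demo()` from a `main` function (built with thread support, for example `-pthread` on GCC or Clang) lets the worker start on the first section while the later ones are still being filled, which is the overlap of “preparing” and “processing” described above.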