Implementation - 2024.2 English

Vitis Libraries

Release Date
2024-11-29
Version
2024.2 English

The detailed algorithm implementation is illustrated below:

Diagram of General Similarity

General Similarity contains most of the modules used in Sparse Similarity and Dense Similarity. It has two DataLoaders, which process either sparse or dense inputs depending on the configuration. As shown in the API diagram above, each PE has four AXI ports for storing input data. In sparse mode, each PE has three valid AXI inputs, corresponding to offset, indices, and weight, so one port is left dangling. In dense mode, the partitioned weight data is stored on every AXI port, which improves the data loading speed and can significantly impact the final performance.

After the DataLoader, the input data is transformed into a COO stream internally, so that the dense and sparse modes can share most of the calculation logic. The overall diagram of the general similarity kernel also includes an insertion sort module, which returns the top K similarity values. The maximum value of K is a template parameter, which can be changed by rebuilding the xclbin. The default value of top K is 32.
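The data flow described above can be sketched in plain C++ as a software model. This is not the kernel source and does not use the library's API: the names `CooEntry`, `sparseToCoo`, `denseToCoo`, `Scored`, and `TopK` are hypothetical. The sketch assumes that the sparse inputs follow a CSR-style layout (offset, indices, weight), that the dense input is a row-major matrix, and that dense mode emits every element, including zeros. It shows how both input modes can be reduced to one COO sequence, and how an insertion sort with a compile-time bound can keep the K largest similarity values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One COO element: (row, column, weight). Hypothetical type for illustration.
struct CooEntry {
    std::uint32_t row;
    std::uint32_t col;
    float weight;
};

// Sparse mode: expand CSR-style (offset, indices, weight) arrays into COO.
// offsets holds numRows + 1 entries; row r owns [offsets[r], offsets[r + 1]).
std::vector<CooEntry> sparseToCoo(const std::vector<std::uint32_t>& offsets,
                                  const std::vector<std::uint32_t>& indices,
                                  const std::vector<float>& weights) {
    assert(!offsets.empty() && indices.size() == weights.size());
    std::vector<CooEntry> coo;
    for (std::size_t r = 0; r + 1 < offsets.size(); ++r) {
        for (std::uint32_t e = offsets[r]; e < offsets[r + 1]; ++e) {
            coo.push_back({static_cast<std::uint32_t>(r), indices[e], weights[e]});
        }
    }
    return coo;
}

// Dense mode: every element of a row-major matrix becomes one COO entry
// (assumption: zeros are not filtered out).
std::vector<CooEntry> denseToCoo(const std::vector<float>& data,
                                 std::size_t numRows, std::size_t numCols) {
    assert(data.size() == numRows * numCols);
    std::vector<CooEntry> coo;
    coo.reserve(data.size());
    for (std::size_t r = 0; r < numRows; ++r) {
        for (std::size_t c = 0; c < numCols; ++c) {
            coo.push_back({static_cast<std::uint32_t>(r),
                           static_cast<std::uint32_t>(c),
                           data[r * numCols + c]});
        }
    }
    return coo;
}

// A candidate result: vertex id and its similarity score (hypothetical type).
struct Scored {
    std::uint32_t id;
    float similarity;
};

// Keeps the K largest similarities in descending order via insertion sort.
// K is a template parameter, so changing it requires recompilation.
template <std::size_t K>
class TopK {
public:
    void insert(std::uint32_t id, float similarity) {
        // Buffer full and candidate not better than the current minimum: drop it.
        if (count_ == K && similarity <= best_[K - 1].similarity) return;
        // Start at the first free slot, or overwrite the current minimum.
        std::size_t pos = (count_ < K) ? count_++ : K - 1;
        // Shift smaller entries down until the insertion point is found.
        while (pos > 0 && best_[pos - 1].similarity < similarity) {
            best_[pos] = best_[pos - 1];
            --pos;
        }
        best_[pos] = {id, similarity};
    }
    std::size_t size() const { return count_; }
    const Scored& operator[](std::size_t i) const { return best_[i]; }

private:
    Scored best_[K] = {};
    std::size_t count_ = 0;
};
```

In this model, `TopK<32>` would correspond to the default top K of 32. Because the bound is a template argument, raising it means recompiling, which mirrors the need to rebuild the xclbin in the real kernel.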