Parallel Lookup Tables

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

Document ID: UG1603
Release Date: 2024-11-28
Version: 2024.2 English

aie::parallel_lookup performs parallel fetches of data from an aie::lut through its fetch method. For the supported index and value data types, see the aie::parallel_lookup section in the AI Engine API User Guide (UG1529).

The following is an example kernel:
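#include <adf.h>
#include <aie_api/aie.hpp>
#include <aie_api/aie_adf.hpp>

const int size = 1024;

// Two arrays, both initialized from the same LUT data file.
alignas(aie::vector_decl_align) int16 lutab[size*2] = {
  #include "data/LUT.h"
};
alignas(aie::vector_decl_align) int16 lutcd[size*2] = {
  #include "data/LUT.h"
};

__attribute__((noinline)) void parallel_lookup(
    input_buffer<uint8>& __restrict index,
    output_buffer<int16>& __restrict out) {
  // LUT with 4 parallel accesses, backed by the two arrays above.
  const aie::lut<4, int16> my_lut(size, lutab, lutcd);
  // Lookup object taking uint8 indexes; the second argument (step_bits) is 0.
  aie::parallel_lookup<uint8, aie::lut<4, int16>> lookup(my_lut, 0);

  auto it = aie::begin_vector<32>(index);
  auto ot = aie::begin_vector<32>(out);
  // Each iteration fetches the values for 32 indexes at once.
  for (int i = 0; i < size/32; i++) {
    aie::vector<uint8,32> vin = *it++;
    *ot++ = lookup.fetch(vin);
  }
}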
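
When checking the kernel output in a host testbench, a scalar reference model can be useful. The following is a minimal sketch in plain C++, not part of the AI Engine API. It assumes that, with unsigned indexes and the second constructor argument set to 0 as in the kernel above, each output lane is simply the logical table entry selected by the corresponding index. The name reference_lookup is hypothetical, and the table argument is the logical table (one value per entry), not the memory layout stored in data/LUT.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-side helper (an assumption, not an AI Engine API call):
// returns table[index[i]] for every i, one element at a time.
std::vector<int16_t> reference_lookup(const std::vector<int16_t>& table,
                                      const std::vector<uint8_t>& index) {
    std::vector<int16_t> out;
    out.reserve(index.size());
    for (uint8_t idx : index) {
        // Reject indexes that fall outside the logical table.
        if (static_cast<std::size_t>(idx) >= table.size()) {
            throw std::out_of_range("index exceeds table size");
        }
        out.push_back(table[idx]);
    }
    return out;
}
```

Comparing the kernel output element by element against such a model is one way to confirm that the LUT data file and the index stream are consistent, under the assumptions stated above.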

To achieve full parallelism, the two LUT arrays (lutab and lutcd in the example above) must be placed in different memory banks. To do so, constrain the placement of the LUTs in the graph. For details, see Global Graph-Scoped Tables.