Host Code Modifications - 2022.2 English

Vitis Tutorials: Hardware Acceleration (XD099)

Document ID: XD099
Release Date: 2022-12-01
Version: 2022.2 English

  1. Open run_sw_overlap.cpp in $LAB_WORK_DIR/reference_files with a file editor.

    Lines 134 through 171 contain the host code changes that overlap CPU processing with FPGA processing. The changes are explained in detail below.

    The following variables track how many words the CPU needs and how many words the FPGA has already processed.

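      // Track how many words the CPU needs to compute the next score and how many
      // words the FPGA has already processed, so CPU and FPGA processing can overlap
      unsigned int  curr_entry;     // packed word entry currently being scored
      unsigned char inh_flags;      // FPGA-computed flag for the current word
      unsigned int  available = 0;  // words whose flags the FPGA has produced so far
      unsigned int  needed = 0;     // words the CPU needs through the current document
      unsigned int  iter = 0;       // index of the next sub-buffer to wait on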
    
  2. Block the host only if the FPGA has not yet computed the hash function for the words the CPU needs next, which allows CPU and FPGA processing to overlap.

      for (unsigned int doc = 0, n = 0; doc < total_num_docs; doc++)
      {
        unsigned long ans = 0;
        unsigned int size = doc_sizes[doc];
    
        // Accumulate the number of words the CPU needs to score this document
        needed += size;
    
        // If the CPU needs more words than the FPGA has processed so far, block until
        // the next sub-buffer completes, then update the number of available words
        // and the sub-buffer count (iter)
    
        if (needed > available)
        {
          flagWait[iter].wait();
          available += subbuf_doc_info[iter].size / sizeof(uint);
          iter++;
        }
    
        for (unsigned i = 0; i < size ; i++, n++)
        {
          curr_entry = input_doc_words[n];
          inh_flags  = output_inh_flags[n];
    
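          // Score only the words flagged by the FPGA
          if (inh_flags)
          {
            unsigned frequency = curr_entry & 0x00ff;  // low 8 bits: word frequency
            unsigned word_id = curr_entry >> 8;        // remaining upper bits: word ID
            ans += profile_weights[word_id] * (unsigned long)frequency;
          }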
        }
        profile_score[doc] = ans;
      }
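
The blocking logic in step 2 can be tried without an FPGA. The following is a minimal, self-contained C++ sketch, not the tutorial's host code: `std::async` tasks stand in for the FPGA, and `std::future::wait()` stands in for `flagWait[iter].wait()`. The names `SubBuffer`, `score_overlapped`, and `demo_scores`, the filter rule (word ID 1 is rejected), and the sample data are all hypothetical. The sketch also uses `while` instead of `if` for the wait, as an assumption to cover a document that spans more than one sub-buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical stand-in for one sub-buffer of FPGA output.
struct SubBuffer {
    std::future<void> done;  // plays the role of flagWait[iter]
    std::size_t num_words;   // plays the role of subbuf_doc_info[iter].size / sizeof(uint)
};

// Score documents on the CPU, blocking only when the flags for the next
// document have not been produced yet.
std::vector<unsigned long> score_overlapped(
    const std::vector<unsigned int>& doc_sizes,
    const std::vector<std::uint32_t>& words,
    const std::vector<unsigned char>& flags,
    const std::vector<unsigned long>& weights,
    std::vector<SubBuffer>& subbufs)
{
    std::vector<unsigned long> scores(doc_sizes.size(), 0);
    std::size_t available = 0, needed = 0, iter = 0, n = 0;

    for (std::size_t doc = 0; doc < doc_sizes.size(); ++doc) {
        needed += doc_sizes[doc];

        // Assumption: loop (not a single check) so that a document larger
        // than one sub-buffer still waits for every sub-buffer it needs.
        while (needed > available) {
            assert(iter < subbufs.size());
            subbufs[iter].done.wait();
            available += subbufs[iter].num_words;
            ++iter;
        }

        unsigned long ans = 0;
        for (unsigned int i = 0; i < doc_sizes[doc]; ++i, ++n) {
            if (flags[n]) {
                std::uint32_t frequency = words[n] & 0xffu;  // low 8 bits
                std::uint32_t word_id = words[n] >> 8;       // upper bits
                ans += weights[word_id] * static_cast<unsigned long>(frequency);
            }
        }
        scores[doc] = ans;
    }
    return scores;
}

// Hypothetical driver: background tasks fill the flags in chunks of three
// words while score_overlapped consumes them.
std::vector<unsigned long> demo_scores() {
    auto pack = [](std::uint32_t word_id, std::uint32_t freq) {
        return (word_id << 8) | freq;
    };
    const std::vector<std::uint32_t> words = {
        pack(0, 2), pack(1, 1),              // document 0
        pack(2, 3), pack(0, 1), pack(1, 4),  // document 1
        pack(2, 5)};                         // document 2
    const std::vector<unsigned int> doc_sizes = {2, 3, 1};
    const std::vector<unsigned long> weights = {10, 20, 30};
    std::vector<unsigned char> flags(words.size(), 0);

    std::vector<SubBuffer> subbufs;  // declared after flags, destroyed first
    const std::size_t chunk = 3;
    for (std::size_t begin = 0; begin < words.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, words.size());
        subbufs.push_back(SubBuffer{
            std::async(std::launch::async, [&flags, &words, begin, end] {
                // Made-up filter: every word ID except 1 passes.
                for (std::size_t k = begin; k < end; ++k)
                    flags[k] = ((words[k] >> 8) != 1) ? 1 : 0;
            }),
            end - begin});
    }
    return score_overlapped(doc_sizes, words, flags, weights, subbufs);
}
```

With this sample data, `demo_scores()` returns the scores 20, 100, and 150. The CPU waits on the first task before scoring document 0, waits on the second task before scoring document 1, and scores document 2 without blocking because its flags are already available. With GCC or Clang, the sketch typically needs the `-pthread` option.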