Open the compute_score_host.cpp file in the file editor. Look at the following code at lines 32 through 58, which computes the output flags.
    // Compute output flags based on hash function output for the words in all documents
    for(unsigned int doc=0;doc<total_num_docs;doc++)
    {
        profile_score[doc] = 0.0;
        unsigned int size = doc_sizes[doc];
        for (unsigned i = 0; i < size ; i++)
        {
            unsigned curr_entry = input_doc_words[size_offset+i];
            unsigned word_id = curr_entry >> 8;
            unsigned hash_pu = MurmurHash2( &word_id , 3,1);
            unsigned hash_lu = MurmurHash2( &word_id , 3,5);
            bool doc_end = (word_id==docTag);
            unsigned hash1 = hash_pu&hash_bloom;
            bool inh1 = (!doc_end) && (bloom_filter[ hash1 >> 5 ] & ( 1 << (hash1 & 0x1f)));
            unsigned hash2 = (hash_pu+hash_lu)&hash_bloom;
            bool inh2 = (!doc_end) && (bloom_filter[ hash2 >> 5 ] & ( 1 << (hash2 & 0x1f)));
            if (inh1 && inh2) {
                inh_flags[size_offset+i]=1;
            } else {
                inh_flags[size_offset+i]=0;
            }
        }
        size_offset+=size;
    }
From this code, you can see:
You are computing two hash outputs for each word in all the documents and creating output flags accordingly.
You already determined that the hash function (MurmurHash2()) is a good candidate for acceleration on the FPGA.
The hash (MurmurHash2()) of one word is independent of the other words, so the hashes can be computed in parallel, which improves the execution time.
The algorithm accesses the input_doc_words array sequentially. This is an important property because, when implemented in the FPGA, it allows for very efficient accesses to the DDR.
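The flag test in the loop treats bloom_filter as a bit array packed into 32-bit words: hash >> 5 picks the word and hash & 0x1f picks the bit inside it. The sketch below isolates that indexing. The helper names (bloom_set, bloom_test, in_filter) are hypothetical and do not appear in the tutorial sources, and it assumes hash_bloom is a mask equal to the filter size in bits minus one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; they are not part of the tutorial sources.
// The filter is a bit array packed into 32-bit words:
//   h >> 5    selects the 32-bit word (h / 32)
//   h & 0x1f  selects the bit inside that word (h % 32)
inline void bloom_set(std::vector<uint32_t>& filter, uint32_t h) {
    filter[h >> 5] |= (1u << (h & 0x1f));
}

inline bool bloom_test(const std::vector<uint32_t>& filter, uint32_t h) {
    return (filter[h >> 5] & (1u << (h & 0x1f))) != 0;
}

// Mirrors the inh1 && inh2 test from the loop: a word is flagged only when
// both derived bit positions are set. hash_bloom is assumed to be
// (filter size in bits - 1), with the size a power of two.
inline bool in_filter(const std::vector<uint32_t>& filter,
                      uint32_t hash_pu, uint32_t hash_lu, uint32_t hash_bloom) {
    const uint32_t hash1 = hash_pu & hash_bloom;
    const uint32_t hash2 = (hash_pu + hash_lu) & hash_bloom;
    return bloom_test(filter, hash1) && bloom_test(filter, hash2);
}
```

Calling in_filter() with only one of the two bits set returns false, which is why the loop writes 0 to inh_flags unless both tests pass. The doc_end check from the original loop is left out here to keep the sketch short.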
This code section is a good candidate for FPGA acceleration because the hash function can run faster on the FPGA, and you can compute hashes for multiple words in parallel by reading multiple words from the DDR in burst mode.
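To make the independence argument concrete, here is a minimal host-side sketch in which the flag for one word is a pure function of that word and the filter, so processing words in groups gives the same result as processing them one by one. It is an illustration under assumptions, not the tutorial's kernel: murmur_hash2 follows the public-domain reference MurmurHash2 and may differ from the tutorial's MurmurHash2(), the names word_flag, flags_sequential, flags_chunked, and LANES are hypothetical, the doc_end check is omitted, and the inner lane loop only models what an FPGA could evaluate concurrently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Public-domain reference MurmurHash2 (Austin Appleby); the tutorial's own
// MurmurHash2() may differ in detail.
inline uint32_t murmur_hash2(const void* key, int len, uint32_t seed) {
    const uint32_t m = 0x5bd1e995u;
    const int r = 24;
    uint32_t h = seed ^ static_cast<uint32_t>(len);
    const unsigned char* data = static_cast<const unsigned char*>(key);
    while (len >= 4) {
        uint32_t k;
        std::memcpy(&k, data, 4);  // avoids unaligned or aliasing reads
        k *= m; k ^= k >> r; k *= m;
        h *= m; h ^= k;
        data += 4; len -= 4;
    }
    switch (len) {  // remaining 1 to 3 bytes
        case 3: h ^= static_cast<uint32_t>(data[2]) << 16;  // fall through
        case 2: h ^= static_cast<uint32_t>(data[1]) << 8;   // fall through
        case 1: h ^= data[0]; h *= m;
    }
    h ^= h >> 13; h *= m; h ^= h >> 15;
    return h;
}

// Flag for a single entry: depends only on the entry and the filter contents.
inline uint8_t word_flag(uint32_t entry, const std::vector<uint32_t>& bloom,
                         uint32_t hash_bloom) {
    uint32_t word_id = entry >> 8;
    const uint32_t hash_pu = murmur_hash2(&word_id, 3, 1);
    const uint32_t hash_lu = murmur_hash2(&word_id, 3, 5);
    const uint32_t hash1 = hash_pu & hash_bloom;
    const uint32_t hash2 = (hash_pu + hash_lu) & hash_bloom;
    const bool inh1 = (bloom[hash1 >> 5] & (1u << (hash1 & 0x1f))) != 0;
    const bool inh2 = (bloom[hash2 >> 5] & (1u << (hash2 & 0x1f))) != 0;
    return (inh1 && inh2) ? 1 : 0;
}

// One word per iteration, as in the host loop.
inline std::vector<uint8_t> flags_sequential(const std::vector<uint32_t>& words,
                                             const std::vector<uint32_t>& bloom,
                                             uint32_t hash_bloom) {
    std::vector<uint8_t> out(words.size());
    for (std::size_t i = 0; i < words.size(); ++i)
        out[i] = word_flag(words[i], bloom, hash_bloom);
    return out;
}

// LANES consecutive words per outer iteration; no lane reads another lane's result.
inline std::vector<uint8_t> flags_chunked(const std::vector<uint32_t>& words,
                                          const std::vector<uint32_t>& bloom,
                                          uint32_t hash_bloom) {
    const std::size_t LANES = 4;  // hypothetical group size
    std::vector<uint8_t> out(words.size());
    for (std::size_t base = 0; base < words.size(); base += LANES) {
        const std::size_t n = std::min(LANES, words.size() - base);
        for (std::size_t lane = 0; lane < n; ++lane)
            out[base + lane] = word_flag(words[base + lane], bloom, hash_bloom);
    }
    return out;
}
```

Because each outer iteration of flags_chunked() reads a contiguous group of entries, the access pattern stays sequential, which is the property that makes burst reads from the DDR possible.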
Now, you are ready to review the second for loop.
Keep compute_score_host.cpp open in the file editor.