In this tutorial, each document consists of an array of words, where each word is a 32-bit unsigned integer composed of a 24-bit word ID and an 8-bit integer representing the frequency. The search array holds the words of interest to the user: a smaller set of 24-bit word IDs, each with an associated weight that determines the importance of the word.
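The word layout described above can be illustrated with a few small helpers. This is a minimal sketch, not code from the tutorial sources: the function names `pack_word`, `word_id_of`, and `frequency_of` are hypothetical, and placing the word ID in the upper 24 bits with the frequency in the lower 8 bits is an assumption, since the text only gives the field widths.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: word ID occupies the upper 24 bits and the frequency the
// lower 8 bits. The text states only the field widths, not their order.

// Hypothetical helper: build one 32-bit document word.
inline std::uint32_t pack_word(std::uint32_t word_id, std::uint8_t frequency) {
    assert(word_id < (1u << 24));  // the ID must fit in 24 bits
    return (word_id << 8) | frequency;
}

// Hypothetical helper: extract the 24-bit word ID.
inline std::uint32_t word_id_of(std::uint32_t entry) {
    return entry >> 8;
}

// Hypothetical helper: extract the 8-bit frequency.
inline std::uint8_t frequency_of(std::uint32_t entry) {
    return static_cast<std::uint8_t>(entry & 0xFFu);
}
```

Under this assumed layout, unpacking is one shift and one mask per word, so a document can be scanned without any per-word parsing.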
Navigate to the `Hardware_Acceleration/Design_Tutorials/02-bloom` directory. Go to the `cpu_src` directory, open the `main.cpp` file, and look at line 63.

The Bloom filter in this application is 64 KB in size, implemented as `1L<<bloom_size`, where `bloom_size` is defined as 14 in the header file `sizes.h` (calculated as `(2^14)*4B = 64 KB`).

The score for each document is obtained by accumulating, over the words in the document, the product of each word ID's weight and its frequency. The greater the score, the more relevant the document is to the search array.
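The size arithmetic and the scoring rule can be checked with a short sketch. It is an illustration under stated assumptions rather than the code in `main.cpp`: `score_document` is a hypothetical helper, the weights are looked up directly in a hash map instead of going through the Bloom filter, and the word ID is assumed to sit in the upper 24 bits of each word.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Value quoted from sizes.h in the text.
constexpr unsigned bloom_size = 14;

// 2^14 entries of 4 bytes each give 64 KB.
constexpr std::size_t bloom_entries = 1L << bloom_size;
constexpr std::size_t bloom_bytes = bloom_entries * sizeof(std::uint32_t);
static_assert(bloom_bytes == 64 * 1024, "Bloom filter should occupy 64 KB");

// Hypothetical helper: score one document against the search array.
// ASSUMPTION: word ID in the upper 24 bits, frequency in the lower 8 bits.
// NOTE: uses a direct map lookup, not the Bloom filter, to keep the sketch short.
std::uint64_t score_document(
    const std::vector<std::uint32_t>& document,
    const std::unordered_map<std::uint32_t, std::uint32_t>& weights) {
    std::uint64_t score = 0;
    for (std::uint32_t entry : document) {
        const std::uint32_t word_id = entry >> 8;
        const std::uint32_t frequency = entry & 0xFFu;
        const auto it = weights.find(word_id);
        if (it != weights.end()) {
            // Accumulate weight * frequency for every matching word.
            score += static_cast<std::uint64_t>(it->second) * frequency;
        }
    }
    return score;
}
```

For example, with weights 3 and 5 for word IDs 10 and 20, a document containing word 10 twice and word 20 four times scores 3*2 + 5*4 = 26, and words absent from the search array contribute nothing.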