Processing 4 words in parallel requires 4 × 32 bits = 128 bits per cycle. However, because the data is contiguous, you should access the DDR with 512-bit transfers, which reduces the number of memory accesses.
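The access-count arithmetic can be sketched as a small C++ helper. This is only an illustration: `ddr_accesses` is a hypothetical name, and it assumes 32-bit words and a bus width that is a multiple of 32 bits.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the tutorial sources): number of DDR
// accesses needed to move `total_words` 32-bit words over a bus that is
// `bus_bits` wide. A partial final access is rounded up to a full one.
inline std::size_t ddr_accesses(std::size_t total_words, std::size_t bus_bits) {
    const std::size_t word_bits = 32;
    // Assumption: the bus carries a whole number of 32-bit words.
    assert(bus_bits >= word_bits && bus_bits % word_bits == 0);
    const std::size_t words_per_access = bus_bits / word_bits;
    return (total_words + words_per_access - 1) / words_per_access;
}
```

For example, moving 1024 words takes `ddr_accesses(1024, 128)` = 256 accesses at 128 bits, but only `ddr_accesses(1024, 512)` = 64 accesses at 512 bits.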
Use the following interface requirements to create the kernel:
Read multiple words stored in the DDR as a 512-bit DDR access, equivalent of reading 16 words per DDR access.
Write multiple flags to the DDR as a 512-bit DDR access, equivalent of writing 32 flags per DDR access.
Process 4 words in parallel, with each word requiring two MurmurHash2 functions.
Compute the hash (two MurmurHash2 functions) for 4 words every cycle.
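The requirements above can be sketched in plain C++ as follows. This is a software model under stated assumptions, not the tutorial's kernel: `BusWord` stands in for one 512-bit DDR access (in Vitis HLS this would typically be an `ap_uint<512>`), each 32-bit word is treated as a 4-byte key, and the names `murmur_hash2_word`, `hash_bus_word`, `kSeedA`, and `kSeedB` as well as the seed values are hypothetical. Writing the flags is omitted because it depends on the filter lookup, which is outside this section.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Stand-in for one 512-bit DDR access: 16 contiguous 32-bit words.
using BusWord = std::array<std::uint32_t, 16>;

constexpr std::size_t kParallelWords = 4;  // words hashed per step (from the text)
constexpr std::uint32_t kSeedA = 1;        // hypothetical seed for the first hash
constexpr std::uint32_t kSeedB = 5;        // hypothetical seed for the second hash

// 32-bit MurmurHash2 specialised to a single 4-byte key (length fixed at 4).
inline std::uint32_t murmur_hash2_word(std::uint32_t key, std::uint32_t seed) {
    const std::uint32_t m = 0x5bd1e995u;
    const int r = 24;
    std::uint32_t h = seed ^ 4u;  // seed mixed with the key length in bytes
    std::uint32_t k = key;
    k *= m;
    k ^= k >> r;
    k *= m;
    h *= m;
    h ^= k;
    // Final avalanche.
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}

struct HashPair {
    std::uint32_t first;
    std::uint32_t second;
};

// Consume one bus word in 16 / 4 = 4 steps, hashing 4 words per step.
inline std::array<HashPair, 16> hash_bus_word(const BusWord& in) {
    std::array<HashPair, 16> out{};
    for (std::size_t step = 0; step < in.size() / kParallelWords; ++step) {
        // In an HLS kernel this inner loop would be the candidate for full
        // unrolling (for example with `#pragma HLS UNROLL`) so that the
        // 4 lanes, 8 hash functions in total, run in the same cycle.
        for (std::size_t lane = 0; lane < kParallelWords; ++lane) {
            const std::size_t i = step * kParallelWords + lane;
            out[i] = {murmur_hash2_word(in[i], kSeedA),
                      murmur_hash2_word(in[i], kSeedB)};
        }
    }
    return out;
}
```

In this model, one 512-bit read feeds four consecutive steps of the compute loop, which is why a single wide access can keep the 4-word datapath busy for several cycles.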
Refer to Methodology for Accelerating Applications with the Vitis Software in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).