XRT allocates memory on 4K (4096-byte) boundaries for its internal memory
management. If the host memory pointer is not aligned to a page boundary, XRT
performs an extra memcpy to align it. You should therefore align the host memory
pointer to a 4K boundary to avoid the extra memory copy.
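To make the alignment requirement concrete, the following is a minimal sketch of a check that a pointer lies on a 4K boundary. The names HOST_ALIGNMENT and is_page_aligned are hypothetical helpers introduced for illustration; they are not part of XRT or OpenCL.

```c
#include <stdint.h>

/* Hypothetical constant: the 4K boundary described in the text. */
#define HOST_ALIGNMENT 4096u

/* Hypothetical helper: returns 1 if ptr lies on a 4K boundary, 0 otherwise. */
static int is_page_aligned(const void *ptr)
{
    /* A pointer is aligned when its address is a multiple of 4096. */
    return ((uintptr_t)ptr % HOST_ALIGNMENT) == 0;
}
```

A check like this could be placed before buffer creation during debugging to confirm that the host pointer does not trigger the extra copy.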
The following example uses posix_memalign instead of malloc to allocate the
host memory.
int *host_mem_ptr; // = (int*) malloc(MAX_LENGTH*sizeof(int));
// Align the host memory to a 4K boundary
// (posix_memalign takes a void**, so the int** must be cast)
posix_memalign((void**)&host_mem_ptr, 4096, MAX_LENGTH*sizeof(int));
// Fill the memory input
for(int i=0; i<MAX_LENGTH; i++) {
    host_mem_ptr[i] = <... >
}
cl_mem dev_mem_ptr = clCreateBuffer(context,
    CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
    sizeof(int) * number_of_words, host_mem_ptr, NULL);
// clEnqueueMigrateMemObjects expects a pointer to an array of cl_mem objects
err = clEnqueueMigrateMemObjects(commands, 1, &dev_mem_ptr, 0, 0,
    NULL, NULL);
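The listing above does not check the result of posix_memalign. The following is a minimal sketch of an allocation wrapper that does. The function name alloc_aligned_ints is a hypothetical helper introduced for illustration, and the feature-test macro is an assumption that may be unnecessary on some toolchains.

```c
/* Assumption: request POSIX.1-2001 declarations so posix_memalign is visible. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical helper: allocates `count` ints on a 4K boundary.
 * Returns NULL if the allocation fails. */
static int *alloc_aligned_ints(size_t count)
{
    void *ptr = NULL;

    /* posix_memalign returns 0 on success and an error number on failure;
     * it does not report errors through errno. */
    if (posix_memalign(&ptr, 4096, count * sizeof(int)) != 0) {
        return NULL;
    }
    return (int *)ptr;
}
```

Memory obtained this way is released with free(). When the pointer is passed to clCreateBuffer with CL_MEM_USE_HOST_PTR, it is safest to release the buffer object before freeing the host memory.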