Change the Makefile to pick host_p2p.cpp instead of host.cpp:

    #HOST_SRC := host.cpp
    HOST_SRC := host_p2p.cpp

Delete host.exe and rebuild it:

    rm -rf host.exe
    make app

Run as before:

    ./host.exe
    Execution of the kernel on device1
    Buffer = 16384 Iterations = 1024 Throughput= 0.78GB/s
    TEST PASSED
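
The reported throughput figure can be sanity-checked by hand. The sketch below is not taken from host_p2p.cpp. It assumes that Buffer is a size in bytes, that each iteration moves exactly one buffer, and that GB means 10^9 bytes; the function name `throughput_gb_per_s` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of host_p2p.cpp.
// Assumptions: buffer_bytes is in bytes, one buffer is moved per iteration,
// and 1 GB = 1e9 bytes.
double throughput_gb_per_s(std::uint64_t buffer_bytes,
                           std::uint64_t iterations,
                           double elapsed_seconds) {
    // A non-positive elapsed time would make the ratio meaningless.
    assert(elapsed_seconds > 0.0);
    // Total bytes moved over the whole run.
    const double total_bytes =
        static_cast<double>(buffer_bytes) * static_cast<double>(iterations);
    // Bytes per second, scaled to gigabytes per second.
    return total_bytes / elapsed_seconds / 1e9;
}
```

Under these assumptions, Buffer = 16384 and Iterations = 1024 amount to 16,777,216 bytes in total, so a reading of 0.78GB/s would correspond to roughly 21.5 ms of transfer time.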