Change the Makefile to build host_p2p.cpp instead of host.cpp:
#HOST_SRC := host.cpp
HOST_SRC := host_p2p.cpp
Delete host.exe and rebuild it so that the new host source is compiled.
Run the host as before:
./host.exe
Execution of the kernel on device1
Buffer = 16384 Iterations = 1024
Throughput= 0.78GB/s
TEST PASSED