This step uses the same topologies as the previous step, but adds the RAMA IP to improve the overall bandwidth. It requires generating new xclbins.
The v++ linker requires a Tcl file to connect the RAMA IP to the AXI master ports. Refer to the file ./makefile/rama_post_sys_link.tcl for more information.
The Makefile creates the cfg-rama.ini file shown below and passes it to the v++ linking phase with the --config cfg-rama.ini option.
[advanced]
param=compiler.userPostSysLinkTcl=<Project>/makefile/rama_post_sys_link.tcl
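As a rough illustration of what the Makefile does here, the sketch below writes an equivalent cfg-rama.ini by hand. `PROJECT_DIR` and `CFG_FILE` are hypothetical variable names standing in for `<Project>` and the output path; the real Makefile may generate the file differently.

```shell
#!/bin/sh
# Sketch only: PROJECT_DIR and CFG_FILE are assumed names, not taken from
# the tutorial's Makefile. PROJECT_DIR stands in for <Project>.
PROJECT_DIR="${PROJECT_DIR:-$PWD}"
CFG_FILE="${CFG_FILE:-cfg-rama.ini}"

# Write the [advanced] section that points v++ at the post-sys-link Tcl script.
cat > "$CFG_FILE" <<EOF
[advanced]
param=compiler.userPostSysLinkTcl=${PROJECT_DIR}/makefile/rama_post_sys_link.tcl
EOF

# The file is then handed to the linking phase, along the lines of:
#   v++ --link ... --config cfg-rama.ini ...
echo "wrote $CFG_FILE"
```

Keeping the Tcl hook in a separate config file means the same link command can be reused with or without the RAMA IP by swapping the `--config` argument.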
To build all the xclbins, run the following target.
make build_with_rama
# This command is already executed in the first module
If the machine doesn’t have enough resources to launch the jobs in parallel, you can invoke the individual targets directly, as shown below, and lower the value passed to -j (for example, -j1 builds them one at a time).
make ramajob-64 ramajob-128 ramajob-256 ramajob-512 ramajob-1024 -j6
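On a resource-constrained machine, a strictly sequential build can also be written as a loop over the targets listed above. This is a minimal sketch: the helper name `build_rama_jobs` is hypothetical, and it takes the command to run as an optional argument so that it can be dry-run with `echo`.

```shell
#!/bin/sh
# Hypothetical helper: runs the ramajob-* targets one after another and
# stops at the first failure. The first argument is the command to invoke
# (defaults to make), which makes a dry run possible.
build_rama_jobs() {
    make_cmd="${1:-make}"
    for size in 64 128 256 512 1024; do
        "$make_cmd" "ramajob-${size}" || {
            echo "ramajob-${size} failed" >&2
            return 1
        }
    done
}

# Dry run: prints the target names in build order without invoking make.
build_rama_jobs echo
```

Calling `build_rama_jobs` with no argument from the directory containing the Makefile runs the real builds in sequence.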
To run all the variations as in the previous step, you can use the following Makefile target, which builds and runs the application.
`make all_hbm_rnd_rama_run`
The above target generates the output file <Project>/makefile/Run_RandomAddressRAMA.perf with the following data.
| Addr Pattern | Total Size (MB) | Transaction Size (B) | Throughput Achieved (GB/s) |
|--------------|-----------------|----------------------|----------------------------|
| Random | 256 (M0->PC0) | 64 | 4.75415 |
| Random | 256 (M0->PC0) | 128 | 9.59875 |
| Random | 256 (M0->PC0) | 256 | 12.6208 |
| Random | 256 (M0->PC0) | 512 | 13.1328 |
| Random | 256 (M0->PC0) | 1024 | 13.1261 |
| Random | 512 (M0->PC0_1) | 64 | 6.39976 |
| Random | 512 (M0->PC0_1) | 128 | 9.59946 |
| Random | 512 (M0->PC0_1) | 256 | 12.799 |
| Random | 512 (M0->PC0_1) | 512 | 13.9621 |
| Random | 512 (M0->PC0_1) | 1024 | 14.1694 |
| Random | 1024 (M0->PC0_3) | 64 | 6.39984 |
| Random | 1024 (M0->PC0_3) | 128 | 9.5997 |
| Random | 1024 (M0->PC0_3) | 256 | 12.7994 |
| Random | 1024 (M0->PC0_3) | 512 | 13.7546 |
| Random | 1024 (M0->PC0_3) | 1024 | 14.0694 |
The first five rows show the point-to-point accesses, i.e. 256 MB accesses, with varying transaction sizes. The bandwidth achieved is very similar to the previous step without the RAMA IP. The next ten rows, covering accesses to 512 MB and 1024 MB respectively, show a significant increase in achieved bandwidth compared to the previous step, where the configuration did not use the RAMA IP.
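To compare runs quickly, the peak throughput for each total size can be pulled out of the results with awk. The sketch below assumes the .perf file stores one whitespace-separated row per measurement in the column order of the table above (pattern, total size, topology, transaction size, throughput); the actual file layout may differ. The helper name `peak_per_size` is hypothetical, and the sample rows are copied from the results above.

```shell
#!/bin/sh
# Hypothetical helper: for each total size, print
#   <total size MB> <transaction size B at peak> <peak throughput GB/s>
# Assumes whitespace-separated rows such as: Random 256 (M0->PC0) 64 4.75415
peak_per_size() {
    awk '$1 == "Random" {
             if ($5 + 0 > best[$2] + 0) { best[$2] = $5; txn[$2] = $4 }
         }
         END { for (k in best) print k, txn[k], best[k] }' "$1" | sort -n
}

# Sample data copied from the results above, in the assumed layout.
perf_sample="$(mktemp)"
trap 'rm -f "$perf_sample"' EXIT
cat > "$perf_sample" <<'EOF'
Random 256 (M0->PC0) 64 4.75415
Random 256 (M0->PC0) 128 9.59875
Random 256 (M0->PC0) 256 12.6208
Random 256 (M0->PC0) 512 13.1328
Random 256 (M0->PC0) 1024 13.1261
Random 512 (M0->PC0_1) 64 6.39976
Random 512 (M0->PC0_1) 128 9.59946
Random 512 (M0->PC0_1) 256 12.799
Random 512 (M0->PC0_1) 512 13.9621
Random 512 (M0->PC0_1) 1024 14.1694
Random 1024 (M0->PC0_3) 64 6.39984
Random 1024 (M0->PC0_3) 128 9.5997
Random 1024 (M0->PC0_3) 256 12.7994
Random 1024 (M0->PC0_3) 512 13.7546
Random 1024 (M0->PC0_3) 1024 14.0694
EOF

peak_per_size "$perf_sample"
```

With the sample rows, this reports a peak of 13.1328 GB/s for 256 MB (at 512 B transactions), 14.1694 GB/s for 512 MB and 14.0694 GB/s for 1024 MB (both at 1024 B transactions). Running the same helper on the previous step's output file, if it follows the same layout, gives a side-by-side view of the RAMA IP's effect.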