Measured results for all modules, with execution on the CPU(*) included for reference:
Module | CPU | Module 1 | Module 2 | Module 3 | Module 4 (NCU=16) |
---|---|---|---|---|---|
Execution Time (µs) | 21,461 | 793,950 | 793,732 | 536,784 | 11,698 |
Speed Up (CPU reference) | 1x | 0.03x | 0.03x | 0.04x | 1.83x |
Speed Up (Module 1 reference) | N/A | 1x | 1x | 1.48x | 68x |
(*): Reference CPU is the Intel® Xeon® Processor E5-2640 v3 (as available on Nimbix)
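
The two speed-up rows follow directly from the execution times: each entry is the reference time divided by the measured time. The sketch below recomputes them as a sanity check. It assumes the unlabeled speed-up row is relative to Module 1, which is what the numbers suggest. The names `times_us`, `speedup`, and `speedup_rows` are illustrative and do not come from the tutorial sources.

```python
# Execution times in microseconds, copied from the table above.
times_us = {
    "CPU": 21_461,
    "Module 1": 793_950,
    "Module 2": 793_732,
    "Module 3": 536_784,
    "Module 4 (NCU=16)": 11_698,
}


def speedup(reference_us, measured_us):
    """Return how many times faster the measured run is than the reference."""
    return reference_us / measured_us


def speedup_rows(times, cpu_key="CPU", baseline_key="Module 1"):
    """Compute both speed-up rows: relative to the CPU and to the baseline module."""
    vs_cpu = {name: speedup(times[cpu_key], t) for name, t in times.items()}
    # The CPU run is excluded here, matching the "N/A" entry in the table.
    vs_baseline = {
        name: speedup(times[baseline_key], t)
        for name, t in times.items()
        if name != cpu_key
    }
    return vs_cpu, vs_baseline


vs_cpu, vs_module1 = speedup_rows(times_us)

for name in times_us:
    baseline = f"{vs_module1[name]:.2f}x" if name in vs_module1 else "N/A"
    print(f"{name:<18} vs CPU: {vs_cpu[name]:.2f}x   vs Module 1: {baseline}")
```

Rounded to two decimals, the values agree with the table: Module 4 comes out at about 1.83x the CPU and about 68x Module 1, while Modules 1 to 3 remain well below CPU performance.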