The following table compares the execution times to simulate 12,800 particles for one timestep on the various N-Body simulators explored in this tutorial.

| Name | Hardware | Algorithm | Average Execution Time for 1 Timestep (seconds) |
|---|---|---|---|
| Python NBody Simulator | x86 Linux Machine | O(N) | 14.96 |
| C++ NBody Simulator | A72 Embedded Arm Processor | O(N²) | 121.295 |
| AI Engine NBody Simulator | Versal AI Engine IP | O(N) | 0.00888979 |

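Based on the times in the table, the N-Body Simulator implemented on the AI Engine is about x1,680 faster than the Python O(N) implementation (14.96 s / 0.00888979 s) and about x13,600 faster than the C++ O(N²) implementation (121.295 s / 0.00888979 s). Note that the three simulators run on different hardware, so these ratios compare complete platforms rather than algorithms alone. You could also use pthreads to parallelize the C++ NBody Simulator into an O(N) implementation, but this is not included in this tutorial.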
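
The improvement factors are the ratio of each baseline time to the AI Engine time. The following is a minimal sketch that recomputes them from the table values; the dictionary keys and the `speedup` helper are illustrative names and are not part of the tutorial sources.

```python
# Average execution times for one timestep (seconds), copied from the table.
times_s = {
    "Python NBody Simulator": 14.96,
    "C++ NBody Simulator": 121.295,
    "AI Engine NBody Simulator": 0.00888979,
}


def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    return baseline_s / accelerated_s


# Hypothetical helper structure: speedup of the AI Engine run over each baseline.
ai_engine_s = times_s["AI Engine NBody Simulator"]
speedups = {
    name: speedup(t, ai_engine_s)
    for name, t in times_s.items()
    if name != "AI Engine NBody Simulator"
}

for name, factor in speedups.items():
    print(f"AI Engine vs. {name}: x{factor:,.0f}")
```

If you re-run the simulators on your own hardware, replace the values in `times_s` with your measured averages to get the corresponding factors.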