Description
This issue relates to long run times caused by a very large number of load and store instructions.
Explanation
The large number of load/store operations has one of the following causes:
- Completely partitioning arrays with more than 1024 elements
- Completely unrolling very large loops
- A very large number of operations generated by the tool
Solution
Partitioning arrays: Split the single dimension into multiple dimensions and partition only the desired dimension by a factor, rather than partitioning the whole array completely. This greatly reduces the number of operations generated.
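The array advice can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes Vitis HLS-style pragma syntax, and the function name sum_buffer, the array name buf, the 512 x 8 shape, and the factor of 4 are all hypothetical choices. A flat 4096-element array is rewritten as a two-dimensional array, and only the second dimension is partitioned by a factor.

```cpp
constexpr int ROWS = 512;  // assumed shape, chosen only for illustration
constexpr int COLS = 8;    // ROWS * COLS = 4096 elements in total

// Hypothetical kernel: copies the input into a local 2-D buffer and sums it.
int sum_buffer(const int in[ROWS * COLS]) {
    // Instead of a flat "int buf[4096]" that is partitioned completely,
    // the buffer is declared with two dimensions.
    int buf[ROWS][COLS];
    // Assumed pragma syntax: partition only dimension 2, cyclically, by 4.
#pragma HLS ARRAY_PARTITION variable=buf cyclic factor=4 dim=2

    // Fill the 2-D buffer from the flat input.
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            buf[r][c] = in[r * COLS + c];
        }
    }

    // Accumulate every element of the buffer.
    int total = 0;
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            total += buf[r][c];
        }
    }
    return total;
}
```

A standard C++ compiler ignores the pragma, so the function can be checked in software before synthesis. The right shape and factor depend on how many elements the design needs to access per cycle.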
Loop unrolling: Move the source code into multiple separate functions and unroll each of them, which results in fewer operations. Alternatively, refactor the code or unroll the loop by a factor instead of unrolling it completely.
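The loop advice can be sketched in the same way. This is a minimal illustration under stated assumptions: the pragma follows Vitis HLS-style syntax, and the names scale_chunk and scale_all, the chunk size, the multiply-by-3 body, and the unroll factor of 8 are hypothetical. One large loop is split across calls to a smaller function, and the loop inside that function is unrolled by a factor rather than completely.

```cpp
constexpr int N = 4096;       // assumed total trip count
constexpr int CHUNK = N / 2;  // each helper call handles half of the data

// Hypothetical helper: processes CHUNK elements starting at index 'start'.
static void scale_chunk(const int in[N], int out[N], int start) {
    for (int i = 0; i < CHUNK; ++i) {
        // Assumed pragma syntax: partial unroll by 8 instead of a full unroll.
#pragma HLS UNROLL factor=8
        out[start + i] = in[start + i] * 3;
    }
}

// Top-level function: the original single loop over N elements is
// replaced by two calls, each covering one half of the index range.
void scale_all(const int in[N], int out[N]) {
    scale_chunk(in, out, 0);
    scale_chunk(in, out, CHUNK);
}
```

Whether splitting into functions, partial unrolling, or both gives the best result depends on the design, so the factor and the number of chunks here are starting points to tune, not recommendations.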