Description
On an FPGA, non-burst accesses to DDR memory are very expensive and can limit the overall performance of the design. It is therefore important to reduce the time needed to fetch the required data. The preferred solution is to rewrite the code so that burst access can be inferred, or to use manual burst; if neither works, another option is to use a cache.
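As a minimal sketch of the "re-write the code" option, assume a loop that needs both in[i] and in[i + 1] in every iteration. Carrying the previous value in a local variable means each element is read from memory once, in increasing address order, which is the kind of access pattern that burst inference generally favors. The function name sum_pairs_single_read is hypothetical and is not part of any Vitis HLS API.

```cpp
#include <cassert>

// Hypothetical rewrite: out[i] = in[i] + in[i + 1], reading each in[] element once.
// Assumes in[] holds at least size + 1 elements and out[] at least size elements.
void sum_pairs_single_read(const double *in, double *out, int size) {
    assert(size >= 0);
    if (size == 0) return;
    double prev = in[0];              // first element, read once before the loop
    for (int i = 0; i < size; i++) {
        double next = in[i + 1];      // the only memory read in this iteration
        out[i] = prev + next;
        prev = next;                  // reused next iteration instead of re-reading
    }
}
```

Whether the tool actually infers a burst for such a loop depends on the rest of the design, so the synthesis report should be checked.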
A cache provides temporary storage in the M_AXI adapter so the design can retrieve data more quickly. The effectiveness of a cache is measured by its hit/miss ratio, and it relies on a property of computer programs called locality of reference: the tendency of a program to access the same set of memory locations repeatedly over a short period of time. When a particular memory block is accessed, it is likely to be accessed again in the near future, along with memory located close to it. For instance, when executing consecutive instructions in memory, the next set of instructions to be accessed will most likely lie within a contiguous block nearby.
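To make the hit/miss ratio concrete, the following sketch models a single-way (direct-mapped) read cache in software and counts hits and misses for an access trace. It is an illustrative model only, not the actual M_AXI adapter logic; the names CacheStats, simulate_cache, and overlapping_trace are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hit/miss counters for a simulated access trace.
struct CacheStats {
    std::size_t hits = 0;
    std::size_t misses = 0;
};

// Simulates a direct-mapped read cache with `lines` cache lines of `depth`
// words each, over a trace of word addresses. Illustrative model only.
CacheStats simulate_cache(const std::vector<std::size_t> &trace,
                          std::size_t lines, std::size_t depth) {
    assert(lines > 0 && depth > 0);
    std::vector<bool> valid(lines, false);
    std::vector<std::size_t> tag(lines, 0);
    CacheStats stats;
    for (std::size_t addr : trace) {
        std::size_t block = addr / depth;   // depth-word block containing addr
        std::size_t index = block % lines;  // the single line this block can occupy
        if (valid[index] && tag[index] == block) {
            ++stats.hits;
        } else {
            ++stats.misses;                 // line fill: the whole block is fetched
            valid[index] = true;
            tag[index] = block;
        }
    }
    return stats;
}

// Builds the word-address trace of the pattern in[i], in[i + 1] for i in [0, n).
std::vector<std::size_t> overlapping_trace(std::size_t n) {
    std::vector<std::size_t> trace;
    trace.reserve(2 * n);
    for (std::size_t i = 0; i < n; ++i) {
        trace.push_back(i);
        trace.push_back(i + 1);
    }
    return trace;
}
```

For example, simulate_cache(overlapping_trace(1024), 8, 128) reports 9 misses out of 2048 reads in this model: one miss for the first touch of each 128-word block, with every other read served from the cache because neighboring addresses fall in the same line.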
See also the config_interface -m_axi_cache_impl command.
Syntax
set_directive_cache <location> port=<name> lines=<value> depth=<value>
Where:
- <location>: Specifies the function in which the specified port is found. This is the top function.
- port=<name>: Specifies the port to add the cache to.
- lines=<value>: Specifies the number of cache lines. The value can be 1, for a single cache line, or a power of 2 greater than 1, for multiple cache lines.
- depth=<value>: Specifies the size of each cache line in words, where a word is one element of the port's pointer data type. The depth must be a power of 2.
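The constraints on lines and depth, and the resulting data capacity, can be checked with simple arithmetic. The sketch below uses hypothetical helper names; the capacity counts data words only and ignores any tag or control storage that a real implementation would add.

```cpp
#include <cstddef>

// True if v is a power of two (1, 2, 4, 8, ...).
constexpr bool is_power_of_two(std::size_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Checks the stated constraints: lines is 1 or a larger power of 2
// (1 is itself 2^0), and depth is a power of 2.
constexpr bool valid_cache_config(std::size_t lines, std::size_t depth) {
    return is_power_of_two(lines) && is_power_of_two(depth);
}

// Number of data words the cache can hold.
constexpr std::size_t cache_words(std::size_t lines, std::size_t depth) {
    return lines * depth;
}

// Data storage in bytes for a cache on a port of type `const T *`.
template <typename T>
constexpr std::size_t cache_data_bytes(std::size_t lines, std::size_t depth) {
    return cache_words(lines, depth) * sizeof(T);
}
```

With lines=8 and depth=128 the cache holds 1024 words; on a port of type double that is 8192 bytes of data on platforms where double occupies 8 bytes.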
Limitations
The CACHE directive or pragma has the following limitations:
- The cache is supported only on read-only ports.
- The cache is implemented as a single-port, single-way cache.
- The cache is not supported when multiple ports are connected to the same bundle.
Example
The following example shows a design in which overlapping accesses (each iteration reads both in[i] and in[i + 1]) cause burst inference to fail. Using the CACHE pragma or directive improves the performance of the design.
extern "C" {
void dut(
    const double *in,  // Read-only input vector
    double *out,       // Output result
    int size           // Number of output elements
)
{
// The cached port is given its own bundle (see Limitations).
#pragma HLS INTERFACE m_axi port=in bundle=aximm2 depth=1026
#pragma HLS INTERFACE m_axi port=out bundle=aximm depth=1024
#pragma HLS cache port=in lines=8 depth=128
    // Consecutive iterations overlap: in[i + 1] is read again as in[i].
    for (int i = 0; i < size; i++)
    {
        out[i] = in[i] + in[i + 1];
    }
}
}
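A C simulation test for the example might look like the following sketch. The kernel is repeated here, with its braces completed and the cached port on a separate bundle, so that the block compiles on its own; a standard C++ compiler ignores the HLS pragmas, possibly with a warning. The helper count_mismatches and the bundle name aximm2 are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

extern "C" {
// Copy of the example kernel. The cached port uses its own bundle because the
// cache is not supported for multiple ports on the same bundle (see Limitations).
void dut(const double *in, double *out, int size) {
#pragma HLS INTERFACE m_axi port=in bundle=aximm2 depth=1026
#pragma HLS INTERFACE m_axi port=out bundle=aximm depth=1024
#pragma HLS cache port=in lines=8 depth=128
    for (int i = 0; i < size; i++) {
        out[i] = in[i] + in[i + 1];
    }
}
}

// Hypothetical test helper: runs dut on known data and returns the number of
// outputs that differ from a reference computed on the host.
int count_mismatches(int size) {
    assert(size >= 0);
    std::vector<double> in(size + 1), out(size, 0.0);
    for (int i = 0; i <= size; i++) {
        in[i] = 0.5 * i;  // multiples of 0.5 are exactly representable
    }
    dut(in.data(), out.data(), size);
    int mismatches = 0;
    for (int i = 0; i < size; i++) {
        double expected = 0.5 * i + 0.5 * (i + 1);
        if (out[i] != expected) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A test bench main would call count_mismatches(1024) and return a nonzero exit code if the result is not 0. Note that the input buffer holds size + 1 elements, because the last iteration reads in[size].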