The stencil pattern is a computational pattern used extensively in scientific computing, image processing, and numerical simulation.
- A stencil algorithm updates each element of an array by applying a predetermined pattern, or "stencil," that specifies how the values of adjacent elements are combined.
- In image processing, for instance, a stencil computation might calculate a weighted average of a pixel and its immediate neighbors to produce a blur. Stencil computations place heavy demands on memory bandwidth on hardware such as FPGAs: the pixels covered by a stencil are not contiguous in memory, so a direct implementation issues many DDR memory reads, which markedly increases computation time.
- To mitigate this, stencil algorithms are typically implemented using window buffer and line buffer techniques. These techniques optimize the data access pattern by caching the relevant data in on-chip memory, so the same values are not read repeatedly from off-chip memory. They can improve memory bandwidth utilization and make parallelization and pipelining more effective.
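As a rough illustration of the line buffer and window buffer idea, the sketch below compares a direct 3x3 filter with a hand-buffered version that reads each source pixel only once, in raster order. The names `filter_direct` and `filter_buffered`, the small `WIDTH` and `HEIGHT` values, and the zero-padding policy are assumptions made for this example, not Vitis HLS APIs. It is plain C++ intended to show the data movement, not a tuned HLS design.

```cpp
// Illustrative sizes (assumptions for this sketch, not values from any tool).
constexpr int WIDTH  = 8;
constexpr int HEIGHT = 6;
constexpr int K      = 3;      // filter is K x K, with K odd
constexpr int R      = K / 2;  // filter radius

// Direct form: each output reads up to K*K pixels straight from src.
void filter_direct(const unsigned char src[HEIGHT][WIDTH],
                   const int coeffs[K][K],
                   int dst[HEIGHT][WIDTH]) {
    for (int y = 0; y < HEIGHT; ++y) {
        for (int x = 0; x < WIDTH; ++x) {
            int sum = 0;
            for (int row = 0; row < K; ++row) {
                for (int col = 0; col < K; ++col) {
                    int xo = x + col - R;
                    int yo = y + row - R;
                    bool outside = xo < 0 || xo >= WIDTH || yo < 0 || yo >= HEIGHT;
                    unsigned char pixel = outside ? 0 : src[yo][xo];
                    sum += pixel * coeffs[row][col];
                }
            }
            dst[y][x] = sum;
        }
    }
}

// Buffered form: src is read once per pixel; neighbors come from on-chip-style buffers.
void filter_buffered(const unsigned char src[HEIGHT][WIDTH],
                     const int coeffs[K][K],
                     int dst[HEIGHT][WIDTH]) {
    unsigned char line_buf[K - 1][WIDTH] = {};  // the previous K-1 image rows
    unsigned char window[K][K] = {};            // the K x K neighborhood

    // Iterate R extra rows and columns so the window center, which trails the
    // read position by R rows and R columns, reaches every output pixel.
    for (int y = 0; y < HEIGHT + R; ++y) {
        for (int x = 0; x < WIDTH + R; ++x) {
            // In HLS code, a pipeline pragma would typically be placed here.

            // Shift the window one column to the left.
            for (int r = 0; r < K; ++r)
                for (int c = 0; c < K - 1; ++c)
                    window[r][c] = window[r][c + 1];

            if (x < WIDTH) {
                // The only read of src for this pixel (zero below the image).
                unsigned char in = (y < HEIGHT) ? src[y][x] : 0;

                // New right-hand column: K-1 older rows from the line buffer, plus the new pixel.
                for (int r = 0; r < K - 1; ++r)
                    window[r][K - 1] = line_buf[r][x];
                window[K - 1][K - 1] = in;

                // Age this column of the line buffer by one row.
                for (int r = 0; r < K - 2; ++r)
                    line_buf[r][x] = line_buf[r + 1][x];
                line_buf[K - 2][x] = in;
            } else {
                // Right of the image: feed a zero column (zero padding).
                for (int r = 0; r < K; ++r)
                    window[r][K - 1] = 0;
            }

            // Emit the output for the window center once it lies inside the image.
            int oy = y - R;
            int ox = x - R;
            if (oy >= 0 && ox >= 0) {
                int sum = 0;
                for (int row = 0; row < K; ++row)
                    for (int col = 0; col < K; ++col)
                        sum += window[row][col] * coeffs[row][col];
                dst[oy][ox] = sum;
            }
        }
    }
}
```

Given the same inputs, both functions should fill `dst` with the same values. The difference is that the buffered form reads `src` once per pixel, whereas the direct form reads it up to K*K times per output.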
Adapting an algorithm to use line and window buffers can be time-consuming and often requires considerable code refactoring. Vitis HLS provides a stencil optimization, applied through a pragma, that automatically implements the line buffer and window buffer and achieves the same performance as the manual approach.
for(int y=0; y<30; ++y)
{
    for(int x=0; x<1000; ++x)
    {
#pragma HLS pipeline II=1
#pragma HLS array_stencil variable=src
        // Apply 2D filter to the pixel window
        int sum = 0;
        for(int row=0; row<FILTER_V_SIZE; row++)
        {
            for(int col=0; col<FILTER_H_SIZE; col++)
            {
                unsigned char pixel;
                int xoffset = (x+col-(FILTER_H_SIZE/2));
                int yoffset = (y+row-(FILTER_V_SIZE/2));
                // Deal with boundary conditions : clamp pixels to 0 when outside of image
                if ( (xoffset<0) || (xoffset>=1000) || (yoffset<0) || (yoffset>=30) ) {
                    pixel = 0;
                } else {
                    pixel = src[yoffset][xoffset];
                }
                sum += pixel*coeffs[row][col];
            }
        }