AI Engine Code Vectorization - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID: XD100
Release Date: 2024-03-05
Version: 2023.2 English

To realize the advantages of AI Engine processing, code must be vectorized. For pixel interpolation, the calculation can be restated as:

$$ \begin{bmatrix} 1 - x_{frac} \\ 1 - x_{frac} \\ x_{frac} \\ x_{frac} \end{bmatrix} .* \begin{bmatrix} 1 - y_{frac} \\ y_{frac} \\ 1 - y_{frac} \\ y_{frac} \end{bmatrix} .* \begin{bmatrix} f(x_1,y_1) \\ f(x_1,y_2) \\ f(x_2,y_1) \\ f(x_2,y_2) \end{bmatrix} \rightarrow \sum ( \cdot ) \rightarrow f(x_q,y_q). $$

The .* operator denotes the element-wise product of vectors, which corresponds to AI Engine vector multiplication. Two vector multiplications are performed: the weights derived from $x_{frac}$ are multiplied by the weights derived from $y_{frac}$, and the result is multiplied by the corresponding pixel values. The summation adds the elements of the final product vector, yielding the interpolated pixel value $f(x_q,y_q)$.
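The formulation above can be sketched in plain Python, with lists standing in for four-lane vectors. This is only an illustrative model, not AI Engine code: the helper names `elementwise_mul` and `interpolate` are hypothetical, and on the device each list operation would be a vector operation rather than a Python loop.

```python
def elementwise_mul(a, b):
    """Lane-by-lane product, standing in for a vector multiply (the .* operator)."""
    return [x * y for x, y in zip(a, b)]


def interpolate(x_frac, y_frac, pixels):
    """Interpolate one pixel from its four neighbours.

    pixels is ordered [f(x1,y1), f(x1,y2), f(x2,y1), f(x2,y2)],
    matching the pixel vector in the text.
    """
    x_weights = [1 - x_frac, 1 - x_frac, x_frac, x_frac]
    y_weights = [1 - y_frac, y_frac, 1 - y_frac, y_frac]
    weights = elementwise_mul(x_weights, y_weights)  # first vector multiply
    products = elementwise_mul(weights, pixels)      # second vector multiply
    return sum(products)                             # sum of product elements


# Hypothetical sample values, chosen only for illustration.
result = interpolate(0.25, 0.5, [10.0, 20.0, 30.0, 40.0])
```

Setting both fractions to 0 returns the first pixel, and setting both to 1 returns the last, which is a quick sanity check on the lane ordering.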

While the vector containing pixel values is acquired directly from kernel input, the other two must be constructed from the values $x_{frac}$ and $y_{frac}$. Assigning individual components of the vectors would involve the scalar processor and reduce performance, so a vector formulation is used instead. Restating the weight vectors as

$$ \begin{bmatrix} 1 - x_{frac} \\ 1 - x_{frac} \\ x_{frac} \\ x_{frac} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ 1 \\ 1 \end{bmatrix} .* \begin{bmatrix} x_{frac} \\ x_{frac} \\ x_{frac} \\ x_{frac} \end{bmatrix} $$

and

$$ \begin{bmatrix} 1 - y_{frac} \\ y_{frac} \\ 1 - y_{frac} \\ y_{frac} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} -1 \\ 1 \\ -1 \\ 1 \end{bmatrix} .* \begin{bmatrix} y_{frac} \\ y_{frac} \\ y_{frac} \\ y_{frac} \end{bmatrix}, $$

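shows that they can be constructed efficiently using vector multiply-accumulate operations: a constant vector serves as the accumulator, and a constant vector of signs is multiplied element-wise by the broadcast fractional value.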
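As a rough model of this construction, the Python sketch below builds both weight vectors with a lane-wise multiply-accumulate. The helpers `broadcast` and `mac` and the constant names are hypothetical and chosen for illustration; they are not AI Engine API calls, and lists again stand in for four-lane vectors.

```python
def broadcast(value, lanes=4):
    """Replicate one scalar across every lane of a vector."""
    return [value] * lanes


def mac(acc, a, b):
    """Lane-wise multiply-accumulate: acc + a .* b."""
    return [c + x * y for c, x, y in zip(acc, a, b)]


# Constant vectors taken from the two equations in the text.
X_OFFSET, X_SIGN = [1, 1, 0, 0], [-1, -1, 1, 1]
Y_OFFSET, Y_SIGN = [1, 0, 1, 0], [-1, 1, -1, 1]


def x_weights(x_frac):
    """Build [1 - x_frac, 1 - x_frac, x_frac, x_frac] with one multiply-accumulate."""
    return mac(X_OFFSET, X_SIGN, broadcast(x_frac))


def y_weights(y_frac):
    """Build [1 - y_frac, y_frac, 1 - y_frac, y_frac] with one multiply-accumulate."""
    return mac(Y_OFFSET, Y_SIGN, broadcast(y_frac))


xw = x_weights(0.25)  # [0.75, 0.75, 0.25, 0.25]
yw = y_weights(0.5)   # [0.5, 0.5, 0.5, 0.5]
```

In this model the offset and sign vectors are fixed constants, so the only per-pixel inputs are the two fractional values, and no lane is assigned individually.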