This example shows a matrix multiply (AB) implemented with the simple fpmul and fpmac intrinsics, in both the real and the complex case. For the complex case, two further examples use the fpmul_conf and fpmac_conf intrinsics to compute AB and A*conj(B).
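To make the two complex products concrete, here is a minimal scalar reference model in plain C++. It is only a sketch for checking results on a host machine: the function name cmatmul_ref and the conj_b flag are hypothetical, conj(B) is assumed to mean the element-wise conjugate of B (no transpose), and nothing here mirrors the argument list of the fpmul_conf or fpmac_conf intrinsics.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Scalar reference for a complex matrix product with row-major storage.
// A is rows x inner, B is inner x cols, the result is rows x cols.
// conj_b == false computes A*B; conj_b == true computes A*conj(B), where
// conj(B) is taken element by element (assumption of this sketch).
// cmatmul_ref and conj_b are hypothetical names, not part of any AIE API.
std::vector<cfloat> cmatmul_ref(const std::vector<cfloat>& A,
                                const std::vector<cfloat>& B,
                                std::size_t rows, std::size_t inner,
                                std::size_t cols, bool conj_b) {
    assert(A.size() == rows * inner);
    assert(B.size() == inner * cols);
    std::vector<cfloat> C(rows * cols, cfloat(0.0f, 0.0f));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            cfloat acc(0.0f, 0.0f);
            for (std::size_t k = 0; k < inner; ++k) {
                cfloat b = B[k * cols + j];
                if (conj_b) {
                    b = std::conj(b);  // negate the imaginary part of B(k, j)
                }
                acc += A[i * inner + k] * b;
            }
            C[i * cols + j] = acc;
        }
    }
    return C;
}
```

A reference like this can be run on the same input data as the kernels to compare the outputs of the AB and A*conj(B) variants.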
Because the intrinsics operate lane by lane, each vector operation computes a number of consecutive columns of the output matrix. The accumulator has a latency of two cycles, which is absorbed by computing two rows of the output matrix at the same time.
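The compute organization described above can be sketched in plain C++ for the real case. This is a scalar model, not the kernel code: lane_mac stands in for a vector multiply-accumulate and is not the fpmac intrinsic, and the lane count kLanes = 8 and the function name matmul_two_rows are assumptions made for illustration. Each lane holds one output column, and two independent accumulators hold two output rows so that consecutive multiply-accumulates never depend on each other.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lane count: one modelled vector operation updates kLanes
// consecutive output columns.
constexpr std::size_t kLanes = 8;
using Vec = std::array<float, kLanes>;

// Scalar model of a lane-by-lane multiply-accumulate: acc[l] += a * b[l].
// Stand-in for a vector MAC; this is NOT the fpmac intrinsic.
inline Vec lane_mac(Vec acc, float a, const float* b) {
    for (std::size_t l = 0; l < kLanes; ++l) {
        acc[l] += a * b[l];
    }
    return acc;
}

// C = A * B with row-major storage. A is rows x inner, B is inner x cols.
// In this sketch rows must be even and cols a multiple of kLanes.
std::vector<float> matmul_two_rows(const std::vector<float>& A,
                                   const std::vector<float>& B,
                                   std::size_t rows, std::size_t inner,
                                   std::size_t cols) {
    assert(A.size() == rows * inner && B.size() == inner * cols);
    assert(rows % 2 == 0 && cols % kLanes == 0);
    std::vector<float> C(rows * cols, 0.0f);
    for (std::size_t i = 0; i < rows; i += 2) {
        for (std::size_t j = 0; j < cols; j += kLanes) {
            Vec acc0{};  // output row i,     columns j .. j + kLanes - 1
            Vec acc1{};  // output row i + 1, same columns
            for (std::size_t k = 0; k < inner; ++k) {
                const float* b = &B[k * cols + j];
                // The two updates alternate between independent accumulators.
                acc0 = lane_mac(acc0, A[i * inner + k], b);
                acc1 = lane_mac(acc1, A[(i + 1) * inner + k], b);
            }
            for (std::size_t l = 0; l < kLanes; ++l) {
                C[i * cols + j + l] = acc0[l];
                C[(i + 1) * cols + j + l] = acc1[l];
            }
        }
    }
    return C;
}
```

Alternating between acc0 and acc1 is the point of the design: the update of one accumulator can be issued while the previous update of the other is still completing.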
All the parameter settings for the fpmul_conf and fpmac_conf intrinsics are explained in the code itself.
Navigate to the MatMult directory. Type make all in the console and wait for the completion of the three stages: aie, aiesim, and aieviz.
The last stage opens vitis_analyzer, which lets you visualize the graph of the design and the simulation process timeline.
In this design you learned:
How to organize the matrix multiply compute sequence when using real or complex floating-point numbers.
How to handle complex floating-point data and complex floating-point coefficients in FIR filters.
How to use the fpmul_conf and fpmac_conf intrinsics.