This example shows a matrix multiply (AB) built with the simple fpmul and fpmac intrinsics, for both real and complex data. The complex case adds two further examples that use the fpmul_conf and fpmac_conf intrinsics to compute AB and A*conj(B).
Because the intrinsics compute lane by lane, each vector operation produces a number of consecutive columns of the output matrix. The accumulator latency of two cycles is absorbed by computing two rows of the output matrix in alternation.
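The compute sequence described above can be sketched in plain C++ as a scalar model. This is not the tutorial's kernel code and it does not call the AIE intrinsics: `lane_mul`, `lane_mac`, `load_lanes`, and `matmul_two_rows` are hypothetical helpers, the lane count of 8 is an assumption made for illustration, and row-major storage is assumed for all matrices.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 8;  // assumed lane count, for illustration only
using Vec = std::array<float, kLanes>;

// Hypothetical scalar stand-in for a lane-wise multiply: r[l] = a * b[l].
Vec lane_mul(float a, const Vec& b) {
    Vec r{};
    for (std::size_t l = 0; l < kLanes; ++l) r[l] = a * b[l];
    return r;
}

// Hypothetical scalar stand-in for a lane-wise multiply-accumulate.
Vec lane_mac(const Vec& acc, float a, const Vec& b) {
    Vec r{};
    for (std::size_t l = 0; l < kLanes; ++l) r[l] = acc[l] + a * b[l];
    return r;
}

// Load kLanes consecutive values starting at 'offset'.
Vec load_lanes(const std::vector<float>& m, std::size_t offset) {
    Vec v{};
    for (std::size_t l = 0; l < kLanes; ++l) v[l] = m[offset + l];
    return v;
}

// C = A * B with A (M x K), B (K x N), C (M x N), all row-major.
// Each pass produces kLanes consecutive columns for two output rows.
std::vector<float> matmul_two_rows(const std::vector<float>& A,
                                   const std::vector<float>& B,
                                   std::size_t M, std::size_t K, std::size_t N) {
    assert(M % 2 == 0 && N % kLanes == 0 && K >= 1);
    assert(A.size() == M * K && B.size() == K * N);
    std::vector<float> C(M * N);
    for (std::size_t i = 0; i < M; i += 2) {
        for (std::size_t j0 = 0; j0 < N; j0 += kLanes) {
            // The first product initializes both accumulators (multiply).
            Vec b = load_lanes(B, j0);
            Vec acc0 = lane_mul(A[i * K], b);
            Vec acc1 = lane_mul(A[(i + 1) * K], b);
            // Remaining products alternate between the two rows, so each
            // accumulator is updated only every other operation.
            for (std::size_t k = 1; k < K; ++k) {
                b = load_lanes(B, k * N + j0);
                acc0 = lane_mac(acc0, A[i * K + k], b);
                acc1 = lane_mac(acc1, A[(i + 1) * K + k], b);
            }
            for (std::size_t l = 0; l < kLanes; ++l) {
                C[i * N + j0 + l] = acc0[l];
                C[(i + 1) * N + j0 + l] = acc1[l];
            }
        }
    }
    return C;
}
```

Alternating the two accumulators means consecutive operations never depend on each other's result, which is the property that lets a pipelined accumulator stay busy.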
All the parameter settings for the fpmul_conf and fpmac_conf intrinsics are explained in the code itself.
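As a reference for what the two complex variants should produce, the following plain C++ model computes either AB or A*conj(B). It is a sketch under stated assumptions, not the tutorial's code: `cmatmul_two_rows` and its helpers are hypothetical, the `conj_b` flag merely stands in for the conjugation settings of the real intrinsics (whose actual parameters are documented in the source code), and the lane count of 4 and row-major storage are assumptions.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;
constexpr std::size_t kCLanes = 4;  // assumed complex lane count, for illustration
using CVec = std::array<cfloat, kCLanes>;

// Load kCLanes consecutive values of B, optionally conjugated.
// 'conj_b' is a hypothetical flag, not an actual intrinsic parameter.
CVec load_clanes(const std::vector<cfloat>& m, std::size_t offset, bool conj_b) {
    CVec v{};
    for (std::size_t l = 0; l < kCLanes; ++l)
        v[l] = conj_b ? std::conj(m[offset + l]) : m[offset + l];
    return v;
}

// Hypothetical lane-wise complex multiply: r[l] = a * b[l].
CVec clane_mul(cfloat a, const CVec& b) {
    CVec r{};
    for (std::size_t l = 0; l < kCLanes; ++l) r[l] = a * b[l];
    return r;
}

// Hypothetical lane-wise complex multiply-accumulate.
CVec clane_mac(const CVec& acc, cfloat a, const CVec& b) {
    CVec r{};
    for (std::size_t l = 0; l < kCLanes; ++l) r[l] = acc[l] + a * b[l];
    return r;
}

// C = A * B (conj_b == false) or C = A * conj(B) (conj_b == true).
// A is M x K, B is K x N, C is M x N, all row-major.
std::vector<cfloat> cmatmul_two_rows(const std::vector<cfloat>& A,
                                     const std::vector<cfloat>& B,
                                     std::size_t M, std::size_t K, std::size_t N,
                                     bool conj_b) {
    assert(M % 2 == 0 && N % kCLanes == 0 && K >= 1);
    assert(A.size() == M * K && B.size() == K * N);
    std::vector<cfloat> C(M * N);
    for (std::size_t i = 0; i < M; i += 2) {
        for (std::size_t j0 = 0; j0 < N; j0 += kCLanes) {
            CVec b = load_clanes(B, j0, conj_b);
            CVec acc0 = clane_mul(A[i * K], b);        // first product: multiply
            CVec acc1 = clane_mul(A[(i + 1) * K], b);
            for (std::size_t k = 1; k < K; ++k) {      // remaining: accumulate
                b = load_clanes(B, k * N + j0, conj_b);
                acc0 = clane_mac(acc0, A[i * K + k], b);
                acc1 = clane_mac(acc1, A[(i + 1) * K + k], b);
            }
            for (std::size_t l = 0; l < kCLanes; ++l) {
                C[i * N + j0 + l] = acc0[l];
                C[(i + 1) * N + j0 + l] = acc1[l];
            }
        }
    }
    return C;
}
```

A model like this can be useful for checking simulator output against expected values, since both variants share the same loop structure and differ only in how the B operand is read.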
Navigate to the `MatMult` directory. Type `make all` in the console and wait for the completion of the 3 stages: `aie`, `aiesim`, and `aieviz`.
The last stage opens vitis_analyzer, which lets you visualize the graph of the design and the simulation process timeline.
In this design you learned:
How to organize the matrix multiply compute sequence when using real or complex floating-point numbers.
How to handle complex floating-point data and complex floating-point coefficients in FIR filters.
How to use the `fpmul_conf` and `fpmac_conf` intrinsics.