Migrating Farrow Filter from AIE to AIE-ML - 2024.1 English

Vitis Tutorials: AI Engine

Document ID: XD100
Release Date: 2024-10-30
Version: 2024.1 English

Migrating Fractional Delay Farrow Filter from AIE to AIE-ML Architecture

Version: Vitis 2024.1

Introduction

A fractional delay filter is a common digital signal processing (DSP) algorithm used in many applications, including digital receivers in modems, where it is required for timing synchronization.

The Fractional Delay Farrow Filter design has already been implemented for the AIE architecture.

Before starting this tutorial on migrating the design from AIE to AIE-ML architecture, it is essential to understand the Farrow Filter and its implementation details with the AIE architecture. This understanding will lay a foundation for grasping the differences and considerations involved in the migration process.

Study the tutorial Fractional Delay Farrow Filter Targeting AIE Architecture to understand the following:

  1. What is a Farrow Filter?

  2. Requirements and AIE System Partitioning

  3. AI Engine Implementation and Optimization

Once you are familiar with the Farrow Filter and its implementation in the AIE architecture, you are ready to migrate the Farrow Filter to the AIE-ML architecture.

The design requirements are unchanged, because you are only migrating the design to the AIE-ML architecture:

Requirements

  • Sampling rate: 1 GSPS
  • I/O data type: cint16
  • Coefficients data type: int16
  • Delay input data type: int16

IMPORTANT: Before beginning the tutorial, make sure that you have read and followed the Vitis Software Platform Release Notes (v2024.1) for setting up the software and installing the VEK280 base platform.

Before starting this tutorial, run the following steps:

  1. Set up your platform by running the xilinx-versal-common-v2024.1/environment-setup-cortexa72-cortexa53-xilinx-linux script as provided in the platform download. This script sets up the SYSROOT and CXX variables. If the script is not present, you must run xilinx-versal-common-v2024.1/sdk.sh.

  2. Set up your ROOTFS to point to the xilinx-versal-common-v2024.1/rootfs.ext4.

  3. Set up your IMAGE to point to xilinx-versal-common-v2024.1/Image.

  4. Set up your PLATFORM_REPO_PATHS environment variable based upon where you downloaded the platform.

Objectives

  • Migrate the Farrow Filter from AIE to AIE-ML architecture

  • Optimize the design to meet the required performance

  • Modify the interface to GMIO

  • Write host code with XRT APIs

  • Implement the design using the Vitis tool

  • Run the design on the board

Migrating the Design from AIE to AIE-ML Architecture

Change the Project Path

Switch the device from AIE to AIE-ML, then compile the design to see whether it compiles without errors.
First, enter the following command to navigate to the project path of the final AIE design:

cd <path-to-tutorial>/designs/farrow_final_aie

Make sure to set the PLATFORM_REPO_PATHS environment variable.

Source the Vitis Tool

Enter the following command to source the Vitis tool:

source /<TOOL_INSTALL_PATH>/Vitis/2024.1/settings.sh

Update the Makefile

Open the Makefile and switch the device from AIE to AIE-ML by setting PLATFORM_USE to the VEK280 base platform, as shown below:

PLATFORM_USE	  := xilinx_vek280_base_202410_1

Save the file.

Compile the Design for x86 Simulation

Enter the following command to compile for x86 simulation:

make x86compile

The compilation fails with the following error:

 In file included from wrap_farrow_kernel1.cpp:2:
./../../farrow_kernel1.cpp:58:19: error: constraints not satisfied for alias template 'sliding_mul_sym_xy_ops' [with Lanes = 8, Points = 8, CoeffStep = 1, DataStepXY = 1, CoeffType = short, DataType = cint16, AccumTag = cacc48]
    acc_f3 = aie::sliding_mul_sym_xy_ops<8,8,1,1,int16,cint16>::mul_antisym(f_coeffs,0,v_buff,9);
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/proj/gsd/vivado/Vitis/2024.1/aietools/include/aie_api/aie.hpp:6492:14: note: because 'arch::is(arch::AIE)' evaluated to false
    requires(arch::is(arch::AIE))

What does the compile error indicate?

The error message indicates that the AIE API sliding_mul_sym_xy_ops<> supports only the AIE architecture, not AIE-ML. The compiler reports that the constraint 'arch::is(arch::AIE)' evaluated to false.

Why is the AIE API sliding_mul_sym_xy_ops<> not supported for AIE-ML?

This API needs only half of the tap values because it exploits the symmetry of the coefficients: a pre-adder combines each pair of mirrored samples before the multiplication, so the remaining taps never have to be loaded.

Comparing the fixed-point multiplication paths of the two architectures shows that the AIE architecture has a pre-adder, which is absent from the AIE-ML architecture.
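The effect of the pre-adder can be illustrated with a scalar model. The following is a minimal sketch in plain C++, not AIE API code. The names expand_taps, fir_full, and fir_folded are hypothetical, and real int16 samples are used for brevity (for cint16 data the same identity applies to the real and imaginary parts separately). It shows that, for symmetric or antisymmetric taps, adding or subtracting mirrored samples first gives the same result as the full-length computation with half the multiplies.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model for illustration; this is not AIE API code.
constexpr std::size_t kTaps = 8;
using FullTaps = std::array<std::int16_t, kTaps>;
using HalfTaps = std::array<std::int16_t, kTaps / 2>;

// Build all 8 taps from the first 4.
// sign = +1 gives symmetric taps, sign = -1 gives antisymmetric taps.
inline FullTaps expand_taps(const HalfTaps& half, int sign) {
    assert(sign == 1 || sign == -1);
    FullTaps full{};
    for (std::size_t k = 0; k < kTaps / 2; ++k) {
        full[k] = half[k];
        full[kTaps - 1 - k] = static_cast<std::int16_t>(sign * half[k]);
    }
    return full;
}

// No pre-adder: every tap is multiplied (8 multiplies per output).
inline std::int64_t fir_full(const FullTaps& c, const std::int16_t* x,
                             std::size_t n) {
    std::int64_t acc = 0;
    for (std::size_t k = 0; k < kTaps; ++k) {
        acc += static_cast<std::int64_t>(c[k]) * x[n + k];
    }
    return acc;
}

// With a pre-adder: mirrored samples are combined first,
// so only the first half of the taps is multiplied (4 multiplies per output).
inline std::int64_t fir_folded(const HalfTaps& half, int sign,
                               const std::int16_t* x, std::size_t n) {
    assert(sign == 1 || sign == -1);
    std::int64_t acc = 0;
    for (std::size_t k = 0; k < kTaps / 2; ++k) {
        const std::int64_t pre =
            static_cast<std::int64_t>(x[n + k]) +
            sign * static_cast<std::int64_t>(x[n + kTaps - 1 - k]);
        acc += static_cast<std::int64_t>(half[k]) * pre;
    }
    return acc;
}
```

For any input, fir_full(expand_taps(half, sign), x, n) equals fir_folded(half, sign, x, n). Without a pre-adder, only the fir_full form is available, which is why the AIE-ML port has to work with the full coefficient set.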

Figure: Pipeline Diagram for AIE and AIE-ML

How to fix this for AIE-ML?

You need to identify an AIE API that uses the full set of tap values for the computation. One such API is aie::sliding_mul_ops<Lanes, Points, CoeffStep, DataStepX, DataStepY, int16, cint16>. Adjust the parameter values according to the API details provided in the AIE API documentation, under Special Multiplication.

The following figure shows the supported parameter types (coeff x data) for the AIE and AIE-ML architectures. In this design, coeff is int16 and data is cint16.

Figure: AIE API Parameters

Initial Porting of Farrow Filter to AIE-ML

Modify the Kernel code using AIE API aie::sliding_mul_ops<>

The parameters for aie::sliding_mul_ops<> are Lanes, Points, CoeffStep, DataStepX, DataStepY, CoeffType, DataType, and AccumTag.

For AIE-ML:

  • Lanes is 16

  • Points can be 8

  • AccumTag is cacc64

The other parameters keep the values used for the AIE architecture:

  • CoeffStep is 1

  • DataStepX is 1

  • DataStepY is 1

  • CoeffType is int16

  • DataType is cint16

The resulting type is aie::sliding_mul_ops<16, 8, 1, 1, 1, int16, cint16>.
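The indexing implied by these parameters can be modeled with a scalar reference. The following is a minimal sketch in plain C++ under stated assumptions: sliding_mul_model, CInt16, and CAcc64 are hypothetical names, not AIE API types, and the sketch asserts that all indices stay in range rather than modeling any wrap-around of the data buffer. Each output lane accumulates Points products, stepping through the coefficients by CoeffStep and through the data by DataStepX within a lane and by DataStepY from one lane to the next.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for cint16 samples and cacc64 accumulator lanes.
struct CInt16 { std::int16_t re; std::int16_t im; };
struct CAcc64 { std::int64_t re; std::int64_t im; };

// Scalar model of a sliding multiplication (illustration only):
// out[lane] = sum over p of
//   coeff[coeff_start + p*CoeffStep] *
//   data[data_start + lane*DataStepY + p*DataStepX]
template <unsigned Lanes, unsigned Points, unsigned CoeffStep,
          unsigned DataStepX, unsigned DataStepY>
std::array<CAcc64, Lanes> sliding_mul_model(
        const std::vector<std::int16_t>& coeff, unsigned coeff_start,
        const std::vector<CInt16>& data, unsigned data_start) {
    std::array<CAcc64, Lanes> out{};  // all lanes start at zero
    for (unsigned lane = 0; lane < Lanes; ++lane) {
        for (unsigned p = 0; p < Points; ++p) {
            const std::size_t ci = coeff_start + p * CoeffStep;
            const std::size_t di = data_start + lane * DataStepY + p * DataStepX;
            // Assumption: no wrap-around, indices must be in range.
            assert(ci < coeff.size() && di < data.size());
            const std::int64_t c = coeff[ci];
            out[lane].re += c * data[di].re;  // real coeff x complex data
            out[lane].im += c * data[di].im;
        }
    }
    return out;
}
```

With the values above, sliding_mul_model<16, 8, 1, 1, 1>(coeff, 0, data, 25) would produce 16 outputs, each the sum of 8 coefficient-by-sample products, with lane 0 starting at data index 25.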

Enter the following command to navigate to the project path of the design:

cd ../farrow_port_initial

Review the kernel code in <path-to-tutorial>/designs/farrow_port_initial/farrow_kernel1.cpp. The necessary changes have already been made. Study the code and observe the following changes:

  • Accumulator size has been changed to cacc64 (acc_f3, acc_f2, acc_f1, acc_f0) as per the AIE API.

  • The full coefficient values are loaded (f_coeffs).

  • The vector iterator size is updated for 16 lanes (p_sig_i, p_y3, p_y2, p_y1, p_y0), compared with eight lanes in the AIE code.

  • The sliding_mul API is called as follows:

    • aie::sliding_mul_ops< 16, 8, 1, 1, 1, int16, cint16>::mul(f_coeffs,0,v_buff,25);

      • Observe the four filter coefficient start locations (0, 8, 16, 24), passed as the second argument of aie::sliding_mul_ops<…>::mul(…).

      • It uses the full coefficient length.
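The start locations 0, 8, 16, and 24 are consistent with four 8-tap filters stored back to back in one coefficient vector. The following is a minimal sketch in plain C++ of that layout. The names coeff_start, pack_taps, and tap_at are hypothetical, and the order in which the four filters are stored is an assumption, because it is not stated here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout model for illustration; not taken from the kernel source.
constexpr std::size_t kBranches = 4;       // four filters
constexpr std::size_t kTapsPerBranch = 8;  // full coefficient length of each
using BranchTaps = std::array<std::int16_t, kTapsPerBranch>;
using PackedTaps = std::array<std::int16_t, kBranches * kTapsPerBranch>;

// Start index of a filter inside the packed vector: 0, 8, 16, 24.
constexpr std::size_t coeff_start(std::size_t branch) {
    return branch * kTapsPerBranch;
}

// Store the four filters back to back (storage order is an assumption).
inline PackedTaps pack_taps(const std::array<BranchTaps, kBranches>& branches) {
    PackedTaps packed{};
    for (std::size_t b = 0; b < kBranches; ++b) {
        for (std::size_t k = 0; k < kTapsPerBranch; ++k) {
            packed[coeff_start(b) + k] = branches[b][k];
        }
    }
    return packed;
}

// Read tap k of a given filter back from the packed vector.
inline std::int16_t tap_at(const PackedTaps& packed, std::size_t branch,
                           std::size_t k) {
    assert(branch < kBranches && k < kTapsPerBranch);
    return packed[coeff_start(branch) + k];
}
```

Under this layout, each mul(…) call selects its filter only through the coefficient start value, while the data buffer and data start stay the same.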

Review the kernel header file <path-to-tutorial>/designs/farrow_port_initial/farrow_kernel1.h:

  • f_taps holds the full coefficient values.

  • TT_ACC has been updated to cacc64.

The farrow_kernel2.cpp file is unchanged.

After finishing the review of the kernel code, proceed to compile and then simulate the design.

Compile and Simulate the Design

Enter the following commands to compile (x86compile) and simulate (x86sim) the design and verify its functional correctness:

$ make x86compile
$ make x86sim

The first command compiles the graph code for simulation on an x86 processor. The second command runs the simulation.

To verify the results, make sure MATLAB is already set up in your command-line environment, then run the following command:

$ make check_sim_output_x86

This command invokes MATLAB to compare the simulator output against golden test vectors. The console should output Max error LSB = 1.
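The tutorial performs this comparison in MATLAB, and the exact metric used by its script is not shown here. The following is a minimal sketch in plain C++ of one plausible interpretation, assuming that "Max error LSB" is the largest absolute difference between simulator output and golden data, taken over both the real and imaginary parts. The names Sample16 and max_error_lsb are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a cint16 sample.
struct Sample16 { std::int16_t re; std::int16_t im; };

// Largest absolute difference, in LSBs, over real and imaginary parts.
// Assumption: this mirrors the intent of the MATLAB check, not its code.
inline int max_error_lsb(const std::vector<Sample16>& sim,
                         const std::vector<Sample16>& golden) {
    assert(sim.size() == golden.size());
    int max_err = 0;
    for (std::size_t i = 0; i < sim.size(); ++i) {
        // Widen to int before subtracting so the difference cannot overflow.
        const int d_re = std::abs(static_cast<int>(sim[i].re) - golden[i].re);
        const int d_im = std::abs(static_cast<int>(sim[i].im) - golden[i].im);
        max_err = std::max({max_err, d_re, d_im});
    }
    return max_err;
}
```

Under this interpretation, a result of 1 means that no output component differs from the golden vector by more than one least significant bit.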

To understand the performance of your initial implementation, you can perform AI Engine emulation using the SystemC simulator by entering the following sequence of commands:

$ make compile
$ make sim
$ make check_sim_output_aie

The first command compiles the graph code for the SystemC simulator, the second command runs the AIE simulation, and the third command invokes MATLAB to compare the simulation output with the test vectors and compute the raw throughput. The average throughput for the I/O ports is displayed at the end of the AIE simulation. After the final command completes, the console should display the following:

Raw Throughput = 415.7 MSPS
Max error LSB = 1
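Comparing this figure with the 1 GSPS requirement shows how much optimization is still needed. The following is a minimal sketch of that arithmetic in plain C++. The helper names are hypothetical, and throughput_msps only illustrates the usual samples-over-time definition, not how the tutorial's script computes its figure.

```cpp
#include <cassert>

// Figures taken from the text: 1 GSPS target, 415.7 MSPS measured.
constexpr double kTargetMsps = 1000.0;
constexpr double kMeasuredMsps = 415.7;

// Factor by which throughput must still improve to reach the target.
constexpr double speedup_needed(double target_msps, double measured_msps) {
    return target_msps / measured_msps;
}

// Generic definition: samples per microsecond equals MSPS.
// Assumption: illustrative only, not the tutorial's measurement code.
inline double throughput_msps(double samples, double elapsed_us) {
    assert(elapsed_us > 0.0);
    return samples / elapsed_us;
}
```

Here speedup_needed(kTargetMsps, kMeasuredMsps) is roughly 2.4, so the initial port reaches less than half of the required sampling rate.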

Analyze the Reports

Enter the following command to launch the Vitis Analyzer and review the reports.

$ vitis_analyzer aiesimulator_output/default.aierun_summary

Select the Graph view.