Creating Traffic Generators in Python and C++ - 2023.1 English

AI Engine Tools and Flows User Guide (UG1076)

Document ID
UG1076
Release Date
2023-06-23
Version
2023.1 English

Overview

Simulation and emulation with external traffic generators are run by launching the simulator/emulator and the traffic generator (TG) at the same time, in parallel. A TG can be written in either Python or C++, using the multi-threading capabilities of these languages.

Writing a Traffic Generator in Python

Writing the traffic generator in Python requires various libraries to be imported:
# Mandatory
import os, sys

import multiprocessing as mp
import threading
import struct

from xilinx_xtlm import ipc_axis_master_util
from xilinx_xtlm import ipc_axis_slave_util
from xilinx_xtlm import xtlm_ipc

# Optional, just for ease of use
import numpy as np
import logging

The Python traffic generator uses APIs available as part of the xtlm_ipc library. For the list of APIs and their usage, see Writing Traffic Generators in Python in the Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).

To integrate the external traffic generator data in the graph you need to:

  1. Declare the PLIOs in the graph:
    plin = input_plio::create("DataIn1",adf::plio_32_bits);
    
    plout = output_plio::create("DataOut1",adf::plio_32_bits);
  2. Instantiate the corresponding AXI master and slave connections in the Python script. This establishes connections to the PLIO ports of the graph:
    plin = ipc_axis_master_util("DataIn1") 
    
    plout = ipc_axis_slave_util("DataOut1")
  3. Send data to the AI Engine.
    1. The data to be sent to the PLIO port in the AI Engine must be set up in the Python script. The traffic generator API expects the data in the form of a stream packet, created using the xtlm_ipc.axi_stream_packet() API:
      packet = xtlm_ipc.axi_stream_packet()

      packet is a structure that contains fields that describe the data to be sent:

      • data_length is the number of bytes of the data
      • data is the data to be sent
      • tlast is the TLAST flag which is set to true or false
        packet.data = data
        packet.tlast = <TLAST> 
    2. After setting these fields, send the packet to the AI Engine using the b_transport API:
      plin.b_transport(packet)
  4. Receive data from the AI Engine. Use the sample_transaction API to receive data from the AI Engine:

    packet = plout.sample_transaction()

For more details on the APIs and their usage in the external traffic generator Python code, see Writing Traffic Generators in Python in the Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).
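
The four steps above can be sketched as a sender thread and a receiver thread running in parallel. This is only an illustration under stated assumptions: LoopbackPort and the use of SimpleNamespace as a packet are hypothetical stand-ins so that the sketch runs without the simulator, and send_frames and receive_frames are illustrative helper names, not part of the library. In a real traffic generator the ports would presumably be ipc_axis_master_util("DataIn1") and ipc_axis_slave_util("DataOut1"), and the packet factory xtlm_ipc.axi_stream_packet.

```python
import queue
import threading
from types import SimpleNamespace


class LoopbackPort:
    """Hypothetical stand-in for a master/slave port pair (not part of xtlm_ipc)."""

    def __init__(self):
        self._fifo = queue.Queue()

    def b_transport(self, packet):
        # Same method name as the master-side API used in this guide
        self._fifo.put(packet)

    def sample_transaction(self):
        # Same method name as the slave-side API; blocks until a packet arrives
        return self._fifo.get()


def send_frames(plin, frames, make_packet):
    """Build one packet per frame and send it (illustrative helper)."""
    for frame in frames:
        packet = make_packet()
        packet.data_length = len(frame)  # number of bytes of the data
        packet.data = frame
        packet.tlast = True              # one frame per packet in this sketch
        plin.b_transport(packet)


def receive_frames(plout, n_frames, received):
    """Collect the payload of n_frames packets (illustrative helper)."""
    for _ in range(n_frames):
        received.append(plout.sample_transaction().data)


port = LoopbackPort()
frames = [bytes([i] * 4) for i in range(3)]
received = []

# SimpleNamespace stands in for xtlm_ipc.axi_stream_packet (assumption)
tx = threading.Thread(target=send_frames, args=(port, frames, SimpleNamespace))
rx = threading.Thread(target=receive_frames, args=(port, len(frames), received))
rx.start()
tx.start()
tx.join()
rx.join()
```

Running the sender and the receiver in separate threads keeps a blocking receive from stalling the sender, which is the reason the threading module appears in the mandatory imports.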

Formatting Data with Traffic Generators in Python

To emulate AXI4-Stream transactions, AXI traffic generators require the payload data to be broken into appropriately sized bursts. For example, sending 128 bytes over a PLIO width of 32 bits (4 bytes) requires 128 bytes / 4 bytes = 32 AXI4-Stream transactions. Converting between byte arrays and AXI transactions can be handled in Python.
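
The arithmetic above can be checked with a short sketch. The helper name split_into_transactions is hypothetical and not part of the xtlm_ipc library; it only slices a byte string into PLIO-width chunks.

```python
def split_into_transactions(payload, plio_width_bits=32):
    """Slice payload into chunks of one PLIO word each (hypothetical helper)."""
    word_bytes = plio_width_bits // 8
    if len(payload) % word_bytes != 0:
        raise ValueError("payload length must be a multiple of the PLIO width")
    return [payload[i:i + word_bytes] for i in range(0, len(payload), word_bytes)]


payload = bytes(range(128))                      # 128 bytes of test data
transactions = split_into_transactions(payload)  # 32-bit PLIO by default
print(len(transactions))                         # 128 / 4 = 32
```

The same 128-byte payload over a 128-bit PLIO (16 bytes per word) would need 8 transactions.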

The Python struct library provides a mechanism to convert between Python and C data types. Specifically, the struct.pack and struct.unpack functions pack and unpack byte arrays according to a format string argument. The following table shows format strings for common C data types and PLIO widths.

For more information see: https://docs.python.org/3/library/struct.html

Table 1. Format Strings for C Data Types and PLIO Widths

Data Type: cfloat
  PLIO Width: PLIO32
    N/A
  PLIO Width: PLIO64, PLIO128
    rVec = np.real(data)
    iVec = np.imag(data)
    out2column = np.zeros((L,2)).astype(np.single)
    out2column.tobytes()
    formatString = "<"+str(len(byte_arry)//4)+"f"

Data Type: cint16
  PLIO Width: PLIO32, PLIO64, PLIO128
    rVec = np.real(data).astype(np.int16)
    iVec = np.imag(data).astype(np.int16)
    formatString = "<"+str(len(byte_arry)//2)+"h"

Data Type: int8
  PLIO Width: PLIO32, PLIO64, PLIO128
    intvec = np.real(data).astype(np.int8)
    formatString = "<"+str(len(byte_arry)//1)+"b"

Data Type: int32
  PLIO Width: PLIO32, PLIO64, PLIO128
    intvec = np.real(data).astype(np.int32)
    formatString = "<"+str(len(byte_arry)//4)+"i"
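
As an illustration of how the format strings in Table 1 are used, the following sketch packs a few cint16 samples into a byte array and unpacks them again. It uses only the standard library so that it runs anywhere. The sample values are made up, and the interleaving order (real part first, then imaginary part) is an assumption to check against your kernel's data layout.

```python
import struct

# Hypothetical test vector: four complex samples with integer parts
data = [1 + 2j, -3 + 4j, 5 - 6j, -7 - 8j]

# Interleave as re0, im0, re1, im1, ... (assumed cint16 layout)
interleaved = []
for sample in data:
    interleaved.append(int(sample.real))
    interleaved.append(int(sample.imag))

# Sending side: "<" is little-endian, "h" is a signed 16-bit integer
formatString = "<" + str(len(interleaved)) + "h"
byte_arry = struct.pack(formatString, *interleaved)

# Receiving side: rebuild the format string from the byte count, as in Table 1
formatString = "<" + str(len(byte_arry) // 2) + "h"
values = struct.unpack(formatString, byte_arry)
recovered = [complex(re, im) for re, im in zip(values[0::2], values[1::2])]
```

Each cint16 sample occupies 4 bytes, so one sample fits in a single 32-bit PLIO word, two in a 64-bit word, and four in a 128-bit word.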

Writing a Traffic Generator in C++

When using C++ to implement an external traffic generator, the following headers are required:

// For the traffic generator
#include "xtlm_ipc.h"
#include <thread>

The C++ traffic generator also uses APIs available as part of the xtlm_ipc sources. For a list of the APIs and their usage, see Writing Traffic Generators in C++ in the Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).

The Makefile dependencies are:
# Libraries directories
PROTO_PATH=$(XILINX_VIVADO)/data/simmodels/xsim/2023.1/lnx64/6.2.0/ext/protobuf/
IPC_XTLM= $(XILINX_VIVADO)/data/emulation/ip_utils/xtlm_ipc/xtlm_ipc_v1_0/cpp/src/
IPC_XTLM_INC= $(XILINX_VIVADO)/data/emulation/ip_utils/xtlm_ipc/xtlm_ipc_v1_0/cpp/inc/
LOCAL_IPC= $(IPC_XTLM)../

LD_LIBRARY_PATH:=$(XILINX_VIVADO)/data/simmodels/xsim/2023.1/lnx64/6.2.0/ext/protobuf/:$(XILINX_VIVADO)/lib/lnx64.o/Default:$(XILINX_VIVADO)/lib/lnx64.o/:$(LD_LIBRARY_PATH)

# Kernel directories
PLKERNELS_DIR := ../../pl_kernels
PLKERNELS := $(PLKERNELS_DIR)/polar_clip.cpp
PLHEADERS := $(PLKERNELS_DIR)/polar_clip.hpp $(PLKERNELS_DIR)/s2mm.hpp $(PLKERNELS_DIR)/mm2s.hpp

# XTLM source files
IPC_SRC := $(LOCAL_IPC)/src/axis/*.cpp $(LOCAL_IPC)/src/common/*.cpp $(LOCAL_IPC)/src/common/*.cc

# Compiler/linker flags
INC_FLAGS := -I$(LOCAL_IPC)/inc -I$(LOCAL_IPC)/inc/axis/ -I$(LOCAL_IPC)/inc/common/ -I$(PROTO_PATH)/include/ -I$(PLKERNELS_DIR) -I$(XILINX_HLS)/include
LIB_FLAGS := -L$(PROTO_PATH)/ -lprotobuf -L$(XILINX_VIVADO)/lib/lnx64.o/ -lrdizlib -L$(GCC)/../../lib64/ -lstdc++ -lpthread

# Compilation
compile: main.cpp $(PLHEADERS) $(PLKERNELS)
	$(GCC) -g main.cpp $(PLKERNELS) $(IPC_SRC) $(INC_FLAGS) $(LIB_FLAGS) -o chain

The steps to integrate an external traffic generator using the C++ APIs are as follows:

  • Declare the external PLIOs in the graph code as below:
    plin = input_plio::create("DataIn1",adf::plio_32_bits);
    
    plout = output_plio::create("DataOut1",adf::plio_32_bits);
  • Instantiate the AXI master (initiator) and slave (target) sockets.
    xtlm_ipc::axis_initiator_socket_util<xtlm_ipc::BLOCKING> plin("DataIn1");
    
    xtlm_ipc::axis_target_socket_util<xtlm_ipc::BLOCKING> plout("DataOut1");
  • Prepare the data. This is the user logic.
  • Send the data.

    A simple API is available if you do not need fine-grained control over how the data is sent.

    std::vector<char> data; // The sender API expects data to be in the form of vector of char
    // Write a user logic to fill in the data
    
    plin.transport(data, size); // Send the data using transport API call

    For advanced users who need fine-grained control over the AXI4-Stream transaction, use the following:

    xtlm_ipc::axi_stream_packet packet;
    
    // set the packet fields
    
    packet.set_data(data.data(), data.size()); 
    
    packet.set_data_length(data.size()); 
    
    packet.set_tlast(1); // optional argument based on the application 

    Send the packets using the transport call:

    plin.transport(packet);
  • Receive the data.

    A simple API is available if you do not need fine-grained control.

    std::vector<char> data; // create empty vector of char
    
    plout.sample_transaction(data);

    For advanced users who need fine-grained control over the AXI4-Stream transaction, use the following:

    xtlm_ipc::axi_stream_packet packet;
    
    plout.sample_transaction(packet); //API to sample the transaction

    For more details on the APIs and their usage in the external traffic generator C++ code, see Writing Traffic Generators in C++ in the Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).

In this example, classes handle the various functions of the traffic generator:
// Assumption: b_init_socket is an alias for the blocking initiator socket shown above
using b_init_socket = xtlm_ipc::axis_initiator_socket_util<xtlm_ipc::BLOCKING>;

class mm2s
{
    std::thread m_thread;
    std::unique_ptr<b_init_socket> m_socket_ptr;
    int count;

    // Thread body: creates the socket, then sends 512 transactions
    void sock_data_handler()
    {
        m_socket_ptr = std::make_unique<b_init_socket>(m_sock_name);
        std::vector<char> data_to_send;

        while (count < 512)
        {
            // Create the data to send to the AI Engine array (vector of bytes)
            data_to_send = ...;

            // transport(data, tlast): TLAST is set on the last sample of each 128-sample frame
            m_socket_ptr->transport(data_to_send, count % 128 == 127);

            count++;
        }
    }

protected:
    // Name of the socket
    const std::string m_sock_name;

public:
    // Initializers are listed in member declaration order
    mm2s(const std::string sock_name) :
        m_socket_ptr(nullptr), count(0), m_sock_name(sock_name)
    {}

    // Starts the sender thread
    void run()
    {
        m_thread = std::thread(&mm2s::sock_data_handler, this);
    }

    // This function allows the user to check for the end of the transmission
    int dataTransferred()
    {
        return count;
    }

    // The destructor waits for the thread to finish
    virtual ~mm2s()
    {
        std::cout << this->m_sock_name << " before join " << std::endl;
        if (m_thread.joinable())
            m_thread.join();
        std::cout << this->m_sock_name << " after join " << std::endl;
    }
};
The main function is simple: it only starts the various components of the traffic generator, inserting delays between them so that the system can initialize without being overloaded:
int main(int argc, char *argv[])
{
    mm2s chain_1_mm2s("DataIn1");
    polar_clip chain_1_pc("clip_in", "clip_out");
    s2mm chain_1_s2mm("DataOut1");

    using namespace std::chrono_literals;

    chain_1_mm2s.run();
    std::cout << "Started mm2s " << std::endl;
    std::this_thread::sleep_for(500ms);

    chain_1_pc.run();
    std::cout << "Started polar_clip " << std::endl;
    std::this_thread::sleep_for(400ms);

    chain_1_s2mm.run();
    std::cout << "Started s2mm " << std::endl;

    // Wait for the end of the simulation (1024 samples received from the S2MM block)
    while (chain_1_s2mm.dataTransferred() != 1024)
    {
        // Wait 2s before retesting
        std::this_thread::sleep_for(2s);
    }
    return 0;
}

The advantage of the C++ traffic generator is that you can use and test your HLS kernels as soon as they are created, without having to synthesize them into a .xo file. This lets you add realism and flexibility to your simulations without having to recreate a .xclbin file.