Input File
The default bit width for input/output streams is 32 bits. The bit width determines the number of samples per line in the simulation input file. How the simulator interprets the samples on each line depends on the expected data type and the PLIO data width. The following table shows how the simulator interprets input samples based on the data type and PLIO interface specification.
| Data Type | PLIO 32 bit | PLIO 64 bit | PLIO 128 bit |
|---|---|---|---|
| | adf::input_plio in = adf::input_plio::create("DataIn1", adf::plio_32_bits, "input.txt"); | adf::input_plio in = adf::input_plio::create("DataIn1", adf::plio_64_bits, "input.txt"); | adf::input_plio in = adf::input_plio::create("DataIn1", adf::plio_128_bits, "input.txt"); |
| int8 | 4 values per line. For example: 6 8 3 2 | 8 values per line. For example: 6 8 3 2 6 8 3 2 | 16 values per line. For example: 6 8 3 2 6 8 3 2 6 8 3 2 6 8 3 2 |
| int16 | 2 values per line. For example: 24 18 | 4 values per line. For example: 24 18 24 18 | 8 values per line. For example: 24 18 24 18 24 18 24 18 |
| int32 | Single value per line. For example: 2386 | 2 values per line. For example: 2386 2386 | 4 values per line. For example: 2386 2386 2386 2386 |
| int64 | N/A | Single value per line. For example: 45678 | 2 values per line. For example: 45678 95578 |
| cint16 | 1 cint value per line (real, imaginary). For example: 1980 485 | 2 cint values per line. For example: 1980 45 180 85 | 4 cint values per line. For example: 1980 485 180 85 980 48 190 45 |
| cint32 | N/A | 1 cint value per line (real, imaginary). For example: 1980 485 | 2 cint values per line. For example: 1980 45 180 85 |
| float | 1 floating point value per line. For example: 893.5689 | 2 floating point values per line. For example: 893.5689 3459.3452 | 4 floating point values per line. For example: 893.5689 39.32 459.352 349.345 |
| cfloat | N/A | 1 floating point cfloat value per line (real, imaginary). For example: 893.5689 24156.456 | 2 floating point cfloat values per line (real, imaginary). For example: 893.5689 24156.456 93.689 256.46 |
| fp16 1 | 2 values per line. For example: 1.2 2.2 | 4 values per line. For example: 1.2 2.2 3.2 4.2 | 8 values per line. For example: 1.2 2.2 3.2 4.2 5.2 6.2 7.2 8.2 |
| mx9 1 | 4 values per line. For example: 107 149 115 45 | 8 values per line. For example: 107 149 115 45 192 43 55 71 | 16 values per line. For example: 107 149 115 45 192 43 55 71 208 44 166 120 179 68 201 41 |
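The per-line counts in the table follow from dividing the PLIO width by the width of one sample. The following minimal Python sketch illustrates that arithmetic and groups a flat list of numbers into lines. The names `SAMPLE_BITS`, `COMPLEX_TYPES`, `values_per_line`, and `format_plio_lines` are hypothetical helpers for illustration and are not part of the Vitis tools. The sample widths are inferred from the table, and complex samples are assumed to be written as two numbers (real, imaginary).

```python
# Bits occupied by one sample of each data type, inferred from the table above.
# mx9 is listed at 8 bits because the TXT file carries its uint8 encoding.
SAMPLE_BITS = {
    "int8": 8, "int16": 16, "int32": 32, "int64": 64,
    "cint16": 32, "cint32": 64, "float": 32, "cfloat": 64,
    "fp16": 16, "mx9": 8,
}
# Complex samples are written as two numbers: real, then imaginary.
COMPLEX_TYPES = {"cint16", "cint32", "cfloat"}

def values_per_line(dtype, plio_bits):
    """Return how many samples fit on one line, or None where the table says N/A."""
    bits = SAMPLE_BITS[dtype]
    if bits > plio_bits:
        return None  # the sample is wider than the PLIO interface
    return plio_bits // bits

def format_plio_lines(numbers, dtype, plio_bits):
    """Group a flat list of numbers into text lines for a PLIO input file."""
    n = values_per_line(dtype, plio_bits)
    if n is None:
        raise ValueError(f"{dtype} does not fit a {plio_bits}-bit PLIO")
    per_line = n * (2 if dtype in COMPLEX_TYPES else 1)
    return [" ".join(str(x) for x in numbers[i:i + per_line])
            for i in range(0, len(numbers), per_line)]

lines = format_plio_lines([24, 18, 24, 18, 24, 18, 24, 18], "int16", 64)
print("\n".join(lines))
```

With a 64-bit PLIO and int16 data, the eight samples are split into two lines of four values, matching the int16 row of the table.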
Encoding Block Floating-point Data for AI Engine Simulator
Block floating-point data, also known as MX data, represents a set of 16 floating-point numbers. See AI Engine-ML Kernel and Graph Programming Guide (UG1603) for more details on MX data.
The MX data types use the following layouts:
| MX Type | Total Block Size (Bytes) | Average Bits per Value | Sign Bit | Mantissa Width (1 Integer Bit + Fractional Bits) |
|---|---|---|---|---|
| MX9 | 18 | 9 | 1 | 7 |
| MX6 | 12 | 6 | 1 | 4 |
| MX4 | 8 | 4 | 1 | 2 |
PLIO TXT files allow the AI Engine simulator to work with MX9 data. Data samples within PLIO TXT files must be arranged according to the PLIO width. To optimize file usage and leverage the full potential of TXT files, AMD recommends representing the floating point numbers in uint8 format.
Representing the floating-point numbers in uint8 format means that a sequence of 18 uint8 values encodes 16 floating-point numbers, representing one block of MX9 data. Before providing this block to the AI Engine simulator, you must arrange the data samples based on the PLIO width. The following example illustrates 16 floating-point numbers encoded into uint8 values.
Set of 16 Floating-Point Values
2.7577e-05 1.0763e-05 -3.0801e-05 2.0654e-05
1.3183e-05 1.708e-05 -3.8159e-05 2.131e-05
-9.2253e-06 2.8738e-05 -2.4526e-05 3.2889e-05
-3.5184e-05 1.9911e-05 2.716e-05 9.2045e-06
Encoding to uint8
The primary exponent, the secondary exponent, and the mantissas are extracted from the 16 floating-point values.
Extract Exponent and Mantissa for the 16 floating-point numbers in uint8
Primary Exponent 107
Secondary Exponent 149
Mantissa
115 45 -64 43 55 71 -80 44 -38 120 -51 68 -73 41 113 38
uint8 representation of MX9: 18 bytes
PrimaryExp SecondaryExp Elem 0 Elem 15
107 149 115 45 192 43 55 71 208 44 166 120 179 68 201 41 113 38
TXT file Representation of the uint8 Value for an Input PLIO Stream of 32-bits
107 149 115 45
192 43 55 71
208 44 166 120
179 68 201 41
113 38 "Automatically padded with zeros if the file ends here"
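In the example above, each negative mantissa appears in the byte stream with the top bit set and its magnitude in the lower seven bits (for instance, -64 becomes 192), which is consistent with the 1 sign bit and 7-bit mantissa listed for MX9. The following Python sketch reproduces that byte layout and the 32-bit PLIO line arrangement. It is an illustration inferred from the numbers in this example, not the Vitis encoder, and it does not compute exponents or mantissas from floating-point input. `to_sign_magnitude` and `mx9_block_to_lines` are hypothetical helper names.

```python
def to_sign_magnitude(m):
    """Map a signed mantissa in [-127, 127] to a uint8 with the sign in bit 7."""
    if not -127 <= m <= 127:
        raise ValueError("mantissa out of range for 1 sign bit + 7 bits")
    return (0x80 | -m) if m < 0 else m

def mx9_block_to_lines(primary_exp, secondary_exp, mantissas, plio_bits=32):
    """Lay out one 18-byte MX9 block as TXT lines, plio_bits // 8 values per line."""
    if len(mantissas) != 16:
        raise ValueError("an MX9 block holds exactly 16 mantissas")
    block = [primary_exp, secondary_exp] + [to_sign_magnitude(m) for m in mantissas]
    per_line = plio_bits // 8
    return [" ".join(str(b) for b in block[i:i + per_line])
            for i in range(0, len(block), per_line)]

# Exponents and mantissas taken from the example above.
mantissas = [115, 45, -64, 43, 55, 71, -80, 44, -38, 120, -51, 68, -73, 41, 113, 38]
lines = mx9_block_to_lines(107, 149, mantissas)
for line in lines:
    print(line)
```

The printed lines match the TXT representation shown above, with the final line holding only the last two bytes of the block.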
AMD provides a convenient Python utility within the Vitis environment. You can use this utility to encode the 16 floating-point numbers and translate them into human-readable uint8 format for MX9 data. The resulting output is 18 bytes long, containing the primary exponent, the secondary exponents, and the mantissa values for each of the 16 numbers.
Similarly, you can decode the uint8 representation of MX9 data back to the original floating-point numbers using the same Python utility. The Vitis install includes Python 3.13.
>source <path/to/Vitis Installation/Vitis/settings.sh>
>python3.13
The following Python code demonstrates how to encode floating-point numbers in numpy format to MX9, and decode MX9 back to a numpy array. It uses the "varray" Python library within the Vitis environment to convert between the two data formats.
# Import the varray library and numpy
import varray as va
import numpy as np
# Create 16 random values in a numpy array
data = np.random.rand(16)
# Print the values
print(data)
# Convert numpy to mx9
m = va.array(data, "mx9")
# Prints m - decoded floating-point values
print(m)
# Prints the uint8 values of m, which include the primary/secondary exponents and mantissas
print(m.bytes)
# Print the size of m
print(m.bytes.size)
# Convert the uint8 values back to a floating-point array
np.asarray(m)
You can write the uint8 values generated in the Python script to an input .txt file. Pass input.txt to the graph during PLIO object creation. This process is identical to how data samples of other data types are input to the aiesimulator.
using namespace adf;
class MyGraph : public adf::graph {
    adf::kernel mx9Kernel;
public:
    adf::input_plio in;
    adf::output_plio out;
    MyGraph() {
        // PLIO ports read and write the uint8-encoded MX9 data as TXT files
        in = adf::input_plio::create("DataIn", adf::plio_32_bits,"data/input.txt");
        out = adf::output_plio::create("DataOut", adf::plio_32_bits,"data/output.txt");
        // Kernel creation and connections are not shown in this excerpt
    }
};
The following kernel code takes an mx9 buffer as input and outputs a buffer of the same data type.
/*
*AIE Kernel
*to exercise mx9 data
*/
#pragma once
#include "adf.h"
#include "my_defs.h"
void passThrough(adf::input_buffer<mx9>& __restrict in,
adf::output_buffer<mx9>& __restrict out);
PLIO and Packet Stream Interface Requirements
The following requirements apply when the TXT file provides data that represents a PLIO port and packet stream interface:
- tkeep is always valid. The tkeep signal set to False is not supported.
- You can combine and send multiple data samples in a line, depending on the width of the interface. The first data sample is sent in the lowest bits of the interface. For example, if the data type is int16 and the AI Engine to PL interface is 64 bits wide, the line 0 1 2 3 is sent as 0x0003000200010000 to the AI Engine.
    0 1 2 3
- tlast in the TXT file denotes that the following line has tlast equal to 1. When tlast is 1, the number of data samples sent to the AI Engine can be equal to or less than the AI Engine to PL interface width. For example, if the data type is int16 and the AI Engine to PL interface is 64 bits wide, the last line 4 5 is sent as 0x00050004 with tlast to the AI Engine:
    0 1 2 3
    tlast
    4 5
- For the packet stream interface, tlast equal to 1 denotes the end of the packet. The packet must be specified in unsigned decimal format. For example, if the AI Engine to PL interface is 64 bits wide, the following lines send the packet header 0x8fff0000 (2415853568 in unsigned decimal):
    tlast
    2415853568
  For the packet stream interface, whose data type is int16 and the AI Engine to PL interface is 64 bits wide, the following lines send the packet header 0x8fff0000, packet data 0x0, and then packet data 0xfffeffff to the AI Engine:
    2415853568
    0
    tlast
    -1 -2
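The packing rule described above (first sample in the lowest bits) can be checked with a short Python sketch. `pack_line` is a hypothetical helper, not a simulator API; it assumes fixed-width samples with negative values wrapped to two's complement.

```python
def pack_line(samples, sample_bits):
    """Pack the samples of one TXT line into a single integer word.

    The first sample lands in the lowest bits. Negative values are
    wrapped to two's complement of width sample_bits.
    """
    mask = (1 << sample_bits) - 1
    word = 0
    for i, s in enumerate(samples):
        word |= (s & mask) << (i * sample_bits)
    return word

full_line = pack_line([0, 1, 2, 3], 16)   # a full 64-bit line of int16 samples
last_line = pack_line([4, 5], 16)         # a shorter line sent with tlast
negative = pack_line([-1, -2], 16)        # negative int16 packet data
header_decimal = 0x8fff0000               # packet header as unsigned decimal

print(f"{full_line:#018x}")
print(f"{last_line:#010x}")
print(f"{negative:#010x}")
print(header_decimal)
```

The four printed values correspond to the words quoted in the list above: 0x0003000200010000, 0x00050004, 0xfffeffff, and 2415853568.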
Output File
The simulator can automatically create a file containing the stream content on each output PLIO port. This file uses the same type of declaration as the input PLIO data files.
adf::output_plio out1 = adf::output_plio::create("DataOut1",adf::plio_32_bits,"output1.txt");
adf::output_plio out2 = adf::output_plio::create("DataOut2",adf::plio_64_bits,"output2.txt");
adf::output_plio out3 = adf::output_plio::create("DataOut3",adf::plio_128_bits,"output3.txt");
The format of the input file also applies to the output file. The number of samples on each line depends on the data type and the PLIO bit width. The simulator timestamps each output line so that you can estimate the data throughput during the simulation. Timestamp units can include the following:
- picosecond (ps)
- nanosecond (ns)
- microsecond (us)
- millisecond (ms)
- second (s)
TLAST is written in the output file if the stream comes from a source that generates a TLAST flag at the end of frame. Below is an example of such an output file.
...
T 15984 ns
4552 4555
T 15988 ns
4558 4561
T 15992 ns
4564 4567
T 15996 ns
4570 4573
T 16 us
4576 4579
T 16004 ns
4582 4585
T 16008 ns
4588 4591
T 16012 ns
4594 4597
T 16016 ns
4600 4603
T 16020 ns
4606 4609
T 16024 ns
TLAST
4612 4615
T 17940 ns
4618 4621
T 17944 ns
4624 4627
T 17948 ns
4630 4633
T 17952 ns
...
In summary, each output of the PLIO port has the following format:
- Timestamp
- TLAST, TKEEP
- Sample DATA values
From the output, you can estimate the design's throughput for this PLIO port. Timestamps are recorded only for valid output. When the PLIO port is quiet, nothing is written to the output file.
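Compute the throughput as the number of output samples divided by the timestamp difference:

$$\text{Throughput} = \frac{\text{Number of Output Samples}}{\text{LastTimestamp} - \text{FirstTimestamp}}$$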
This equation overestimates the throughput if you do not take into account all the clock cycles that occur after the last output sample.
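As a worked instance of this calculation, the following Python sketch applies it to the first five timestamped lines of the example output above (two samples per line). `raw_throughput_msps` is a hypothetical helper; it assumes timestamps already converted to nanoseconds, so samples per nanosecond are scaled by 1000 to give megasamples per second.

```python
def raw_throughput_msps(timestamps_ns, samples_per_line):
    """Number of output samples divided by elapsed time, in megasamples per second."""
    elapsed_ns = timestamps_ns[-1] - timestamps_ns[0]
    if elapsed_ns <= 0:
        raise ValueError("need at least two distinct timestamps")
    total_samples = len(timestamps_ns) * samples_per_line
    return total_samples / elapsed_ns * 1000.0

# Timestamps (ns) of the first five lines in the example; "T 16 us" is 16000 ns.
stamps = [15984, 15988, 15992, 15996, 16000]
result = raw_throughput_msps(stamps, 2)
print(result)
```

The result, 625 Msps, is higher than the steady rate visible in the listing (2 samples every 4 ns, or 500 Msps) because the time following the last sample is not counted. This is the overestimate described above, and it shrinks as more lines are included.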
Throughput can be highly overestimated if this is a frame-based output. In that case, the output has the following format:
- Frame 0 output (tstart_0 up to tend_0)
- Quiet interframe
- Frame 1 output (tstart_1 up to tend_1)
- Quiet interframe
- ...
- Frame N-1 output (tstart_N-1 up to tend_N-1)
- Quiet interframe
- Frame N output (tstart_N up to tend_N)
In that case, you must account for the interframe time lapse of each frame output. You can do this by using only the first N frames (from 0 to N-1). Replace the timestamps of the throughput equation with the following:
- FirstTimestamp = tstart_0
- LastTimestamp = tstart_N (the first output timestamp of the last frame)
Following is an example Python script that calculates the PLIO throughput. The script analyzes the simulation output file and calculates the throughput.
import numpy as np
import sys
import argparse

def GetTime_ns(Stamp):
    # Stamp is a split timestamp line: ['T', value, unit]
    Time_ns = float(Stamp[1])
    if(Stamp[2] == 'ps'):
        Time_ns = Time_ns/1000.0
    elif(Stamp[2] == 'us'):
        Time_ns = Time_ns*1000.0
    elif(Stamp[2] == 'ms'):
        Time_ns = Time_ns*1000000.0
    elif(Stamp[2] == 's'):
        Time_ns = Time_ns*1000000000.0
    return(Time_ns)

def ReadFile(filename):
    # Detect the number of data per PLIO output
    fdr = open(filename,'r')
    ts = fdr.readline()
    d = fdr.readline()
    dw = d.split()
    fdr.close()
    coltime = 0
    coldata = 1
    numdata = len(dw)
    coltlast = numdata + 1
    # Initializes the output array
    # Format: timestamp (in ns) val1 val2 ... valN TLAST (0 or 1)
    a = np.zeros((0,numdata+2))
    fdr = open(filename,'r')
    line = ' '
    while line !="" :
        line = fdr.readline()
        res = line.split()
        # Skip blank lines and anything that is not a timestamp
        if(len(res) == 0 or res[0] != 'T'):
            continue
        l = np.zeros((1,numdata+2))
        # Extract the time stamp
        l[0][0] = GetTime_ns(res)
        line = fdr.readline()
        res = line.split()
        # Extract the TLAST
        if(len(res) > 0 and res[0]=='TLAST'):
            tlast = 1
            line = fdr.readline()
            res = line.split()
        else:
            tlast = 0
        # A timestamp without a full data line (truncated file) is ignored
        if(len(res) < numdata):
            continue
        l[0,coltlast] = tlast
        # Extract all values
        for i in range(numdata):
            l[0,i+1] = float(res[i])
        # Appends to the whole array
        a = np.append( a , l,axis=0)
    fdr.close()
    return(a)

def Throughput(Filename,IsComplex):
    V = ReadFile(Filename)
    print("\n==============================")
    print(Filename)
    print("\n")
    NRows = V.shape[0]
    NCols = V.shape[1]
    NFullFrames = int(np.sum(V[:,NCols-1]))
    print("Number of Full Frames: " + str(NFullFrames))
    # Basic Throughput computation
    # A complex sample occupies two values (real, imaginary)
    if IsComplex:
        Ratio = 0.5
    else:
        Ratio = 1
    # Samples per ns multiplied by 1000 gives megasamples per second
    RawThroughputMsps = float(NRows*(NCols-2))/(V[NRows-1,0]-V[0,0])*Ratio*1000.0
    print("Raw Throughput: %.2f" % RawThroughputMsps)
    # If the output is frame based, compute a more precise throughput
    tlast = np.where(V[:,NCols-1] == 1.0)
    if(len(tlast[0])<=1):
        TotalThroughputMsps = RawThroughputMsps
    else:
        tlast = tlast[0]
        EndRow = tlast[len(tlast)-2]+1
        # EndRow is the number of Rows I take into account for the number of datasource
        # The timestamp I am interested in is the timestamp of the next transaction
        TotalThroughputMsps = float(EndRow*(NCols-2))/(V[EndRow,0]-V[0,0])*Ratio*1000.0
    print(" Throughput: %.2f" % TotalThroughputMsps)
    print("\n")

# Entry point of this file
if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog=sys.argv[0], description='Compute the throughput corresponding to some output of AIE Simulations')
    parser.add_argument('--iscomplex', action='store_true', help='Indicates Complex data in the file')
    parser.add_argument('filename',nargs='+')
    Args = sys.argv
    Args.pop(0)
    args = parser.parse_args(Args)
    for f in args.filename:
        Throughput(f,args.iscomplex)