Output Buffer Length - 2025.2 English

Vitis Libraries

Release Date: 2025-12-17
Version: 2025.2 English

The output buffer length depends on the compute mode, as listed in Table 3 (OUT_BUFFER_LEN): ceil((TP_F_LEN + TP_G_LEN - 1), LANES) for FULL mode, ceil(TP_F_LEN, LANES) for SAME mode, and ceil((TP_F_LEN - TP_G_LEN + 1), LANES) for VALID mode. Here, ceil(a, b) rounds a up to the nearest multiple of b, and LANES is the number of parallel data lanes available in the AIE hardware. The buffer is therefore always sized to a multiple of LANES, which allows efficient vector processing.

Formula for ceil (the division is an integer division that discards the remainder):

ceil(a,b) ==> (((a+b-1)/b) * b)
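
As a minimal sketch, the ceil formula above can be written as a C++ helper. The name ceil_to_multiple is illustrative and is not taken from the library; the sketch assumes non-negative a and positive b, and relies on C++ integer division truncating the quotient.

```cpp
#include <cstddef>

// Round a up to the nearest multiple of b, mirroring ceil(a,b) in the text.
// Hypothetical helper name; integer division discards the remainder.
constexpr std::size_t ceil_to_multiple(std::size_t a, std::size_t b) {
    return ((a + b - 1) / b) * b;
}

// Compile-time checks using the values from the worked example (LANES = 16).
static_assert(ceil_to_multiple(95, 16) == 96, "95 rounds up to 96");
static_assert(ceil_to_multiple(64, 16) == 64, "an exact multiple is unchanged");
static_assert(ceil_to_multiple(33, 16) == 48, "33 rounds up to 48");
```

Note that a value that is already a multiple of b is returned unchanged, so no extra lane group is allocated in that case.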

Table 3 OUT_BUFFER_LEN

TP_COMPUTE_MODE   MODE NAME   OUT_BUFFER_LEN
0                 FULL        ceil((TP_F_LEN + TP_G_LEN - 1), LANES)
1                 SAME        ceil(TP_F_LEN, LANES)
2                 VALID       ceil((TP_F_LEN - TP_G_LEN + 1), LANES)

Where:

  • TP_F_LEN is the length of input F vector.
  • TP_G_LEN is the length of input G vector.
  • LANES is the number of parallel data lanes available in the AIE hardware, which depends on the data type combination used. See Table 4 (LANES).
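
The three formulas in Table 3 can be sketched in C++ as follows. The enum and function names are illustrative and are not library API. The sketch assumes TP_G_LEN <= TP_F_LEN so that the VALID length is positive, and that the caller supplies a non-zero LANES value for the data type combination in use.

```cpp
// Compute modes numbered as in the TP_COMPUTE_MODE column of Table 3.
enum class ComputeMode : int { FULL = 0, SAME = 1, VALID = 2 };

// Number of valid output samples before padding (hypothetical helper).
constexpr int out_data_len(ComputeMode mode, int f_len, int g_len) {
    return mode == ComputeMode::FULL ? f_len + g_len - 1
         : mode == ComputeMode::SAME ? f_len
                                     : f_len - g_len + 1;  // VALID
}

// Output buffer length: the valid length rounded up to a multiple of lanes.
constexpr int out_buffer_len(ComputeMode mode, int f_len, int g_len, int lanes) {
    return ((out_data_len(mode, f_len, g_len) + lanes - 1) / lanes) * lanes;
}

// F_LEN = 64, G_LEN = 32, LANES = 16 (int16 x int16 on AIE-1).
static_assert(out_buffer_len(ComputeMode::FULL, 64, 32, 16) == 96, "FULL");
static_assert(out_buffer_len(ComputeMode::SAME, 64, 32, 16) == 64, "SAME");
static_assert(out_buffer_len(ComputeMode::VALID, 64, 32, 16) == 48, "VALID");
```

Keeping the unpadded length in its own function makes it easy to tell how many samples in the buffer are valid and how many are padding.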

Table 4 LANES

InputF Data Type  InputG Data Type  Output Data Type  AIE-1 Lanes  AIE-ML Lanes  AIE-MLv2 Lanes
int8              int8              int16             0            32            64
int16             int16             int32             16           16            32
int32             int16             int32             8            16            32
cint16            int16             cint16            8            16            32
cint16            int16             cint32            8            16            32
cint16            int32             cint32            8            16            16
cint16            cint16            cint32            8            16            16
cint32            int16             cint32            4            16            16
cint32            cint16            cint32            4            8             16
float             float             float             8            32            16
cfloat            float             cfloat            4            0             0
cfloat            cfloat            cfloat            4            0             0
bfloat16          bfloat16          float             0            16            64
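
One way to use the LANES table in host-side sizing code is a simple lookup keyed on the three data types and the target device. The sketch below transcribes the table values as data; the struct, enum, and function names are hypothetical and do not correspond to any library API. Entries listed as 0 in the table are returned unchanged, and a combination that is not in the table returns -1.

```cpp
#include <cstring>

// Column order of the three lane columns in the LANES table.
enum Arch { AIE1 = 0, AIE_ML = 1, AIE_MLV2 = 2 };

struct LanesRow {
    const char* f_type;    // InputF data type
    const char* g_type;    // InputG data type
    const char* out_type;  // Output data type
    int lanes[3];          // indexed by Arch
};

// Values copied row by row from the LANES table.
static const LanesRow kLanesTable[] = {
    {"int8",     "int8",     "int16",  {0,  32, 64}},
    {"int16",    "int16",    "int32",  {16, 16, 32}},
    {"int32",    "int16",    "int32",  {8,  16, 32}},
    {"cint16",   "int16",    "cint16", {8,  16, 32}},
    {"cint16",   "int16",    "cint32", {8,  16, 32}},
    {"cint16",   "int32",    "cint32", {8,  16, 16}},
    {"cint16",   "cint16",   "cint32", {8,  16, 16}},
    {"cint32",   "int16",    "cint32", {4,  16, 16}},
    {"cint32",   "cint16",   "cint32", {4,  8,  16}},
    {"float",    "float",    "float",  {8,  32, 16}},
    {"cfloat",   "float",    "cfloat", {4,  0,  0}},
    {"cfloat",   "cfloat",   "cfloat", {4,  0,  0}},
    {"bfloat16", "bfloat16", "float",  {0,  16, 64}},
};

// Return the table entry for the given combination, or -1 if there is no row.
inline int lookup_lanes(const char* f, const char* g, const char* out, Arch arch) {
    for (const LanesRow& row : kLanesTable) {
        if (std::strcmp(row.f_type, f) == 0 &&
            std::strcmp(row.g_type, g) == 0 &&
            std::strcmp(row.out_type, out) == 0) {
            return row.lanes[arch];
        }
    }
    return -1;
}
```

The output type is part of the key because cint16 x int16 appears twice in the table, once with a cint16 output and once with a cint32 output.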

Example Config:

Data_F = int16
Data_G = int16
Data_Out = int32
Func_Type = 1 (conv)
compute_mode = 0 (FULL), 1 (SAME), or 2 (VALID)
F_LEN = 64
G_LEN = 32

in_F[F_LEN] = [1, 2, 3, ..., 64]
in_G[G_LEN] = [1, 2, 3, ..., 32]

FULL Mode:
   OUT_DATA_LEN = (TP_F_LEN + TP_G_LEN - 1) --> (64+32-1) --> 95
   LANES = 16 for int16xint16 data combo
   Output_Buffer_len = ceil(95,16) --> (((95+16-1)/16)*16) --> ((110/16)*16) --> (6*16) --> 96
   Therefore, the output buffer has 95 valid output samples and 1 zero sample.

SAME Mode:
   OUT_DATA_LEN = TP_F_LEN --> 64
   LANES = 16 for int16xint16 data combo
   Output_Buffer_len = ceil(64,16) --> (((64+16-1)/16)*16) --> ((79/16)*16) --> (4*16) --> 64
   Therefore, the output buffer has 64 valid output samples.

VALID Mode:
   OUT_DATA_LEN = (TP_F_LEN - TP_G_LEN + 1) --> (64-32+1) --> 33
   LANES = 16 for int16xint16 data combo
   Output_Buffer_len = ceil(33,16) --> (((33+16-1)/16)*16) --> ((48/16)*16) --> (3*16) --> 48
   Therefore, the output buffer has 33 valid output samples and 15 zero samples.
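
The zero-sample counts in the three cases above can be cross-checked independently of the ceil formula by using the remainder of OUT_DATA_LEN divided by LANES. This is a minimal sketch with a hypothetical function name, assuming a non-negative length and a positive lane count.

```cpp
// Number of padding samples needed to reach the next multiple of lanes.
// Hypothetical helper; the outer % lanes maps a full group (lanes) back to 0.
constexpr int zero_padding(int out_data_len, int lanes) {
    return (lanes - out_data_len % lanes) % lanes;
}

// LANES = 16 for the int16 x int16 combination used in the example.
static_assert(zero_padding(95, 16) == 1, "FULL: 95 valid + 1 zero = 96");
static_assert(zero_padding(64, 16) == 0, "SAME: 64 valid + 0 zero = 64");
static_assert(zero_padding(33, 16) == 15, "VALID: 33 valid + 15 zero = 48");

// Valid length plus padding reproduces the Output_Buffer_len values.
static_assert(95 + zero_padding(95, 16) == 96, "FULL buffer length");
static_assert(64 + zero_padding(64, 16) == 64, "SAME buffer length");
static_assert(33 + zero_padding(33, 16) == 48, "VALID buffer length");
```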