Input Data

Vitis Libraries

Release Date
2026-02-09
Version
2025.2 English

Input data must have length TP_LEN × 4 samples for each vector (P and Q), regardless of TP_DIM value. The hardware processes data in 4-element groups, using only the first TP_DIM elements for calculations.

Configuration Example:

TP_LEN = 8, TP_DIM = 3, TP_IS_OUTPUT_SQUARED = 1

in_P[32] = [1,2,3,4, 5,6,7,8, 9,10,11,12, ..., 29,30,31,32]
in_Q[32] = [32,31,30,29, 28,27,26,25, 24,23,22,21, ..., 4,3,2,1]

Data Organization:

4-Element Groups (TP_LEN=8):
P0:{1,2,3,4}   P1:{5,6,7,8}   P2:{9,10,11,12}   ...   P7:{29,30,31,32}
Q0:{32,31,30,29} Q1:{28,27,26,25} Q2:{24,23,22,21} ... Q7:{4,3,2,1}

When TP_DIM = 3, only the first 3 elements of each group are used:
P0→{1,2,3} Q0→{32,31,30} (4th element ignored)
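The padding rule above can be sketched on the host side as follows. This is a minimal illustration, not part of the library: `pack_points` and `pad_value` are hypothetical names, and the range check assumes TP_DIM lies between 1 and 4. Because the unused slots are ignored, the choice of 0 as filler is arbitrary.

```python
FIXED_DIM = 4  # group width stated in the text


def pack_points(points, tp_dim, pad_value=0):
    """Flatten TP_DIM-dimensional points into padded 4-element groups.

    Hypothetical host-side helper; pad_value is an arbitrary filler
    because the unused slots are ignored during calculation.
    """
    if not 1 <= tp_dim <= FIXED_DIM:
        raise ValueError("tp_dim is assumed to be between 1 and 4")
    flat = []
    for point in points:
        if len(point) != tp_dim:
            raise ValueError("every point must have exactly tp_dim coordinates")
        flat.extend(point)                               # the used elements
        flat.extend([pad_value] * (FIXED_DIM - tp_dim))  # the ignored slots
    return flat


points_p = [(1, 2, 3), (5, 6, 7)]
print(pack_points(points_p, tp_dim=3))  # [1, 2, 3, 0, 5, 6, 7, 0]
```

Each 3-element point occupies a full 4-sample group, so a buffer built this way always has TP_LEN × 4 samples.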

Data Fetching and Processing Pattern

The algorithm processes data in chunks of 4 elements (FIXED_DIM = 4) from each input vector. For TP_LEN = 8, this results in 8 processing iterations, where each iteration compares corresponding 4-element groups from vectors P and Q.

Memory Layout and Access Pattern

Memory Layout (FIXED_DIM = 4):

Vector P Memory:
[1][2][3][4]   [5][6][7][8]   [9][10][11][12]   ...   [29][30][31][32]
    P0             P1              P2                        P7

Vector Q Memory:
[32][31][30][29]   [28][27][26][25]   [24][23][22][21]   ...   [4][3][2][1]
       Q0                 Q1                 Q2                      Q7

Processing Flow:
Iteration 0: P0{1,2,3,4} <--> Q0{32,31,30,29}
Iteration 1: P1{5,6,7,8} <--> Q1{28,27,26,25}
...
Iteration 7: P7{29,30,31,32} <--> Q7{4,3,2,1}
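The access pattern above amounts to slicing both flat buffers at a stride of 4 and pairing the slices by group index. The sketch below models that in plain Python; `iter_group_pairs` is a hypothetical name, and the code illustrates the layout only, not how the hardware fetches data.

```python
FIXED_DIM = 4  # group width stated in the text


def iter_group_pairs(in_p, in_q, tp_len):
    """Yield (group index, P group, Q group) for each 4-element group."""
    for i in range(tp_len):
        start = i * FIXED_DIM
        yield i, in_p[start:start + FIXED_DIM], in_q[start:start + FIXED_DIM]


# Buffers from the configuration example (TP_LEN = 8).
in_p = list(range(1, 33))      # 1, 2, ..., 32
in_q = list(range(32, 0, -1))  # 32, 31, ..., 1

for i, p_group, q_group in iter_group_pairs(in_p, in_q, tp_len=8):
    print(f"P{i}{p_group} <--> Q{i}{q_group}")
```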

Processing Algorithm (TP_DIM = 3 Example)

For each iteration, only the first TP_DIM elements from each 4-element group are used in calculations. When TP_DIM = 3, the 4th element in each group is ignored. The process repeats for all TP_LEN vector pairs (8 iterations in this example).

Calculation Process:

For each iteration i (0 to 7):

1. Extract TP_DIM elements:     Pi[0..2] and Qi[0..2]  (indices 0 to 2 inclusive, for TP_DIM = 3)
2. Compute differences:         Diff = Qi - Pi  (element-wise)
3. Square differences:          Squared = Diff²  (element-wise)
4. Sum squared differences:     Sum = Σ(Squared)
5. Apply square root:           Distance = √Sum  (only if TP_IS_OUTPUT_SQUARED = 0; otherwise Sum is the output)
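The five steps can be sketched as a scalar reference model, useful for checking expected outputs on the host. This is an illustrative sketch under stated assumptions: `euclidean_distances` is a hypothetical name, the code is not the library kernel, and it does not model the hardware vectorization or the numeric data types.

```python
import math

FIXED_DIM = 4  # group width stated in the text


def euclidean_distances(in_p, in_q, tp_len, tp_dim, is_output_squared):
    """Scalar reference model of the per-group calculation."""
    out = []
    for i in range(tp_len):
        base = i * FIXED_DIM
        p = in_p[base:base + tp_dim]              # step 1: first TP_DIM elements
        q = in_q[base:base + tp_dim]
        diff = [qj - pj for pj, qj in zip(p, q)]  # step 2: element-wise difference
        squared = [d * d for d in diff]           # step 3: element-wise square
        total = sum(squared)                      # step 4: sum of squares
        # step 5: square root only when the squared output is not requested
        out.append(total if is_output_squared else math.sqrt(total))
    return out


in_p = list(range(1, 33))
in_q = list(range(32, 0, -1))

squared = euclidean_distances(in_p, in_q, tp_len=8, tp_dim=3, is_output_squared=1)
print(squared[:3])  # [2531, 1331, 515]

roots = euclidean_distances(in_p, in_q, tp_len=8, tp_dim=3, is_output_squared=0)
print([round(d, 2) for d in roots[:3]])  # [50.31, 36.48, 22.69]
```

Slicing with `base + tp_dim` rather than `base + FIXED_DIM` is what drops the padded fourth element when TP_DIM = 3.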

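Concrete Examples (the sum of squares is the output when TP_IS_OUTPUT_SQUARED = 1, as in the configuration above; the square root is the output when TP_IS_OUTPUT_SQUARED = 0):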

D0: P0{1,2,3} vs Q0{32,31,30}  →  {31²+29²+27²}  →  2531  →  √2531 = 50.31
D1: P1{5,6,7} vs Q1{28,27,26}  →  {23²+21²+19²}  →  1331  →  √1331 = 36.48
D2: P2{9,10,11} vs Q2{24,23,22} → {15²+13²+11²}  →  515   →  √515  = 22.69
...
D7: P7{29,30,31} vs Q7{4,3,2}  →  {(-25)²+(-27)²+(-29)²} → 2195  →  √2195 = 46.85

Key Points:

  • Element Selection: Only the first TP_DIM elements of each group are used (the 4th element is ignored when TP_DIM = 3)
  • Vectorized Operations: Differences, squaring, and summation use hardware vectorization
  • Output Control: The square root is applied only when TP_IS_OUTPUT_SQUARED = 0; otherwise the sum of squares is output

Important

Critical Memory Requirement: The Euclidean Distance (ED) function expects the input data for P and Q to contain TP_LEN × 4 samples each, regardless of the dimension the user selects (TP_DIM).

This means:

  • Always allocate memory for 4 elements per vector group
  • Even if TP_DIM = 1, 2, or 3, you still need 4 elements per group
  • Total memory required: TP_LEN × 4 samples for each input vector
  • Unused elements (when TP_DIM < 4) are ignored during calculation
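A host-side size check along these lines can catch undersized buffers before they reach the kernel. The sketch is an assumption-laden illustration: `required_samples` and `check_input_buffers` are hypothetical names, and the TP_DIM range of 1 to 4 is inferred from the bullets above rather than taken from a parameter specification.

```python
FIXED_DIM = 4  # group width stated in the text


def required_samples(tp_len):
    """Samples needed per input vector; independent of TP_DIM."""
    return tp_len * FIXED_DIM


def check_input_buffers(in_p, in_q, tp_len, tp_dim):
    """Raise ValueError if either buffer does not hold TP_LEN x 4 samples."""
    if not 1 <= tp_dim <= FIXED_DIM:
        raise ValueError("tp_dim is assumed to be between 1 and 4")
    expected = required_samples(tp_len)
    for name, buf in (("P", in_p), ("Q", in_q)):
        if len(buf) != expected:
            raise ValueError(
                f"input {name} has {len(buf)} samples, expected {expected}"
            )
    return expected


# TP_LEN = 8 needs 32 samples per vector even when TP_DIM = 1.
print(check_input_buffers([0] * 32, [0] * 32, tp_len=8, tp_dim=1))  # 32
```

A buffer sized as TP_LEN × TP_DIM (for example 24 samples for TP_LEN = 8, TP_DIM = 3) fails this check, which is the mistake the requirement warns against.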