fpmul - 2025.2 English - XD100

Vitis Tutorials: AI Engine Development (XD100)

Document ID
XD100
Release Date
2025-12-05
Version
2025.2 English

The simple floating-point multiplier comes in many different flavors, with or without mixing float and cfloat vector data types. When both operands are cfloat, the intrinsic results in two microcode instructions that must be scheduled. The first buffer is either 1024 bits long (vector<float,32>, vector<cfloat,16>) or 512 bits long (vector<float,16>, vector<cfloat,8>); the second buffer is always 256 bits long (vector<float,8>, vector<cfloat,4>). Any combination of first and second buffer types is allowed.
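The buffer widths quoted above follow from the element sizes. The sketch below checks that arithmetic at compile time. It is an illustration only: `cfloat_model` and `vector_bits` are hypothetical names introduced here, not part of the AI Engine API, and the checks assume a 32-bit `float` with a cfloat made of two such values.

```cpp
#include <cstddef>

// Hypothetical stand-in for cfloat: a 32-bit real part plus a 32-bit imaginary part.
struct cfloat_model {
    float real;
    float imag;
};

// Hypothetical helper: total width in bits of a vector holding `lanes` elements of type T.
template <typename T>
constexpr std::size_t vector_bits(std::size_t lanes) {
    return lanes * sizeof(T) * 8;
}

// First buffer: 1024-bit or 512-bit variants (assumes sizeof(float) == 4).
static_assert(vector_bits<float>(32) == 1024, "vector<float,32> is 1024 bits");
static_assert(vector_bits<cfloat_model>(16) == 1024, "vector<cfloat,16> is 1024 bits");
static_assert(vector_bits<float>(16) == 512, "vector<float,16> is 512 bits");
static_assert(vector_bits<cfloat_model>(8) == 512, "vector<cfloat,8> is 512 bits");

// Second buffer: always 256 bits.
static_assert(vector_bits<float>(8) == 256, "vector<float,8> is 256 bits");
static_assert(vector_bits<cfloat_model>(4) == 256, "vector<cfloat,4> is 256 bits");
```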

vector<float,8> fpmul(vector<float,32> xbuf, int xstart, unsigned int xoffs, vector<float,8> zbuf, int zstart, unsigned int zoffs)

Returns the multiplication result.

Parameter   Comment
xbuf        First multiplication input buffer.
xstart      Starting offset for all lanes of X.
xoffs       4 bits per lane, additional lane-dependent offset for X.
zbuf        Second multiplication input buffer.
zstart      Starting offset for all lanes of Z. This must be a compile time constant.
zoffs       4 bits per lane, additional lane-dependent offset for Z.

for (i = 0; i < 8; i++)
  ret[i] = xbuf[xstart + xoffs[i]] * zbuf[zstart + zoffs[i]]

Here xoffs[i] and zoffs[i] denote the 4-bit offset fields that xoffs and zoffs hold for lane i.
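To make the lane selection concrete, here is a minimal scalar model of the float by float variant in plain C++. It is a sketch for reasoning on a host machine, not the AI Engine intrinsic: `fpmul_model` is a hypothetical name, `std::array` stands in for the vector types, and two behaviors are assumptions not stated in this section: lane 0 is taken from the least-significant 4 bits of each offset word, and indices wrap around the buffer length.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of fpmul(vector<float,32>, ..., vector<float,8>, ...).
// This is NOT the AI Engine intrinsic; it only mirrors the documented pseudocode.
std::array<float, 8> fpmul_model(const std::array<float, 32>& xbuf, int xstart,
                                 std::uint32_t xoffs,
                                 const std::array<float, 8>& zbuf, int zstart,
                                 std::uint32_t zoffs)
{
    std::array<float, 8> ret{};
    for (int i = 0; i < 8; ++i) {
        // Assumption: lane i reads bits [4*i+3 : 4*i] of each offset word.
        const int xo = static_cast<int>((xoffs >> (4 * i)) & 0xFu);
        const int zo = static_cast<int>((zoffs >> (4 * i)) & 0xFu);

        // Assumption: indices wrap modulo the buffer length (32 for X, 8 for Z).
        const unsigned xi = static_cast<unsigned>(xstart + xo) % 32u;
        const unsigned zi = static_cast<unsigned>(zstart + zo) % 8u;

        ret[i] = xbuf[xi] * zbuf[zi];
    }
    return ret;
}
```

Under these assumptions, calling `fpmul_model(x, 8, 0x76543210, z, 0, 0x00000000)` multiplies `x[8]` through `x[15]` by the single value `z[0]`, while setting both offset words to `0x76543210` gives an element-wise product of eight consecutive X values with the eight Z values.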