These functions are fully configurable fpmul and fpmac functions. The output can always be considered to have eight values, because the real and imaginary parts of each complex float are treated as separate lanes. For a vector<cfloat,4>, the loop iterates over real0, imag0, real1, imag1, and so on. This capability provides flexibility and makes it possible to implement operations on conjugates.
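
The interleaved lane view can be illustrated with a small scalar sketch in plain C++. It does not use the AIE intrinsics; `to_lanes` and `negate_lanes` are hypothetical helpers introduced only for illustration, and the assumption is that a set bit in an 8-bit mask negates the corresponding float lane.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical helper (not part of the intrinsic API): view four complex
// floats as eight interleaved float lanes: real0, imag0, real1, imag1, ...
std::array<float, 8> to_lanes(const std::array<std::complex<float>, 4>& v) {
    std::array<float, 8> lanes{};
    for (std::size_t i = 0; i < 4; ++i) {
        lanes[2 * i] = v[i].real();      // even lanes hold real parts
        lanes[2 * i + 1] = v[i].imag();  // odd lanes hold imaginary parts
    }
    return lanes;
}

// Hypothetical helper: negate every lane whose bit is set in an 8-bit mask
// (bit 0 corresponds to lane 0). This is an assumed model of lane negation.
std::array<float, 8> negate_lanes(std::array<float, 8> lanes, unsigned mask) {
    assert(mask <= 0xFFu);  // only eight lanes exist
    for (std::size_t i = 0; i < 8; ++i) {
        if ((mask >> i) & 1u) {
            lanes[i] = -lanes[i];
        }
    }
    return lanes;
}
```

Under this model, `negate_lanes(to_lanes(v), 0xAA)` negates only the odd (imaginary) lanes, which yields the complex conjugate of each element of `v`.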
vector<float,8> fpmac_conf(vector<float,8> acc, vector<float,32> xbuf, int xstart, unsigned int xoffs, vector<float,8> zbuf, int zstart, unsigned int zoffs, bool ones, bool abs, unsigned int addmode, unsigned int addmask, unsigned int cmpmode, unsigned int & cmp)
Returns the result of the multiply-accumulate operation (for fpmul_conf, the multiplication result).
Parameter | Comment |
---|---|
acc | Current accumulator value. This parameter does not exist for fpmul_conf. |
xbuf | First multiplication input buffer. |
xstart | Starting offset for all lanes of X. |
xoffs | 4 bits per lane: Additional lane-dependent offset for X. |
zbuf | Optional second multiplication input buffer. If zbuf is not specified, xbuf is taken as the second buffer. |
zstart | Starting offset for all lanes of Z. This must be a compile-time constant. |
zoffs | 4 bits per lane: Additional lane-dependent offset for Z. |
ones | If true, all lanes from Z are replaced with 1.0. |
abs | If true, the absolute value is taken before accumulation. |
addmode | Select one of fpadd_add (all add), fpadd_sub (all sub), fpadd_mixadd or fpadd_mixsub (add-sub or sub-add pairs). This must be a compile-time constant. |
addmask | 8 x 1 LSB bits: Corresponding lane is negated if bit is set (depending on addmode). |
cmpmode | Use fpcmp_lt to select the minimum between accumulator and result of multiplication per lane, fpcmp_ge for the maximum and fpcmp_nrm for the usual sum. |
cmp | Optional 8 x 1 LSB bits: When using fpcmp_ge or fpcmp_lt in "cmpmode", it sets a bit if accumulator was chosen (per lane). |
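
The lane selection and compare parameters in the table can be sketched with a scalar reference model in plain C++. This is not the intrinsic itself, and several points are assumptions rather than documented behaviour: lane i is assumed to read element `(start + nibble_i) % N`, where `nibble_i` is bits 4i to 4i+3 of the offset word; ties in the compare modes are assumed to be resolved as written in the comments; and `select_lanes`, `mac_model` and `CmpMode` are hypothetical names. The `ones`, `abs`, `addmode` and `addmask` parameters are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of per-lane selection (not an AIE intrinsic).
// Assumption: lane i reads buf[(start + nibble_i) % N], where nibble_i is
// the 4-bit field of offs belonging to lane i.
template <std::size_t N>
std::array<float, 8> select_lanes(const std::array<float, N>& buf, int start,
                                  unsigned offs) {
    assert(start >= 0);
    std::array<float, 8> out{};
    for (std::size_t lane = 0; lane < 8; ++lane) {
        unsigned nibble = (offs >> (4 * lane)) & 0xFu;  // 4 bits per lane
        out[lane] = buf[(static_cast<std::size_t>(start) + nibble) % N];
    }
    return out;
}

// Hypothetical stand-ins for fpcmp_nrm, fpcmp_lt and fpcmp_ge.
enum class CmpMode { Nrm, Lt, Ge };

// Hypothetical scalar model of the compare behaviour described in the table.
// Nrm: acc + x*z. Lt: per-lane minimum. Ge: per-lane maximum.
// A bit in cmp is set when the accumulator value was kept for that lane.
std::array<float, 8> mac_model(const std::array<float, 8>& acc,
                               const std::array<float, 8>& x,
                               const std::array<float, 8>& z, CmpMode mode,
                               unsigned& cmp) {
    cmp = 0;
    std::array<float, 8> out{};
    for (std::size_t lane = 0; lane < 8; ++lane) {
        float prod = x[lane] * z[lane];
        bool keep_acc = false;
        switch (mode) {
            case CmpMode::Nrm:
                out[lane] = acc[lane] + prod;  // usual accumulate
                break;
            case CmpMode::Lt:
                keep_acc = acc[lane] < prod;   // assumed: tie keeps product
                out[lane] = keep_acc ? acc[lane] : prod;
                break;
            case CmpMode::Ge:
                keep_acc = acc[lane] >= prod;  // assumed: tie keeps accumulator
                out[lane] = keep_acc ? acc[lane] : prod;
                break;
        }
        if (keep_acc) {
            cmp |= 1u << lane;
        }
    }
    return out;
}
```

With an offset word of `0x76543210`, lane i adds i to the start offset, so the model reads eight consecutive elements beginning at `start`.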