These functions are fully configurable fpmul and fpmac functions. The output can always be treated as eight float values, because the real and imaginary parts of each complex float are handled as separate lanes. For a vector<cfloat,4>, the loop iterates over real0 - imag0 - real1 - imag1 … This capability provides flexibility and makes it possible to implement operations on conjugates.
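
The interleaved lane layout can be pictured on the host with plain C++. The sketch below is not AI Engine code and does not call any intrinsic: it only shows how four complex values map onto eight float lanes and how a per-lane negation mask yields the conjugate. The names `to_lanes`, `negate_lanes`, and `kImagLanes` are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <complex>

// Host-side illustration only; none of these names belong to the AI Engine API.

// Flatten four complex values into eight float lanes: re0, im0, re1, im1, ...
std::array<float, 8> to_lanes(const std::array<std::complex<float>, 4>& v) {
    std::array<float, 8> lanes{};
    for (int i = 0; i < 4; ++i) {
        lanes[2 * i]     = v[i].real();
        lanes[2 * i + 1] = v[i].imag();
    }
    return lanes;
}

// Hypothetical helper: negate every lane whose bit is set in mask (bit i -> lane i).
std::array<float, 8> negate_lanes(std::array<float, 8> lanes, unsigned mask) {
    assert(mask <= 0xFFu);  // only eight lanes exist
    for (int i = 0; i < 8; ++i) {
        if ((mask >> i) & 1u) {
            lanes[i] = -lanes[i];
        }
    }
    return lanes;
}

// Odd lanes hold the imaginary parts, so negating them gives the conjugate.
constexpr unsigned kImagLanes = 0xAAu;
```

Calling `negate_lanes(to_lanes(v), kImagLanes)` leaves the real lanes untouched and flips the sign of each imaginary lane. In the intrinsic itself, whether a set `addmask` bit negates a lane also depends on `addmode`, so the mask here is only a picture of the lane layout.
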
vector<float,8> fpmac_conf(vector<float,8> acc, vector<float,32> xbuf, int xstart, unsigned int xoffs, vector<float,8> zbuf, int zstart, unsigned int zoffs, bool ones, bool abs, unsigned int addmode, unsigned int addmask, unsigned int cmpmode, unsigned int & cmp)
Returns the multiplication result. For fpmac_conf, this result is combined with acc as selected by cmpmode.
| Parameter | Comment |
|---|---|
| acc | Current accumulator value. This parameter does not exist for fpmul_conf. |
| xbuf | First multiplication input buffer. |
| xstart | Starting offset for all lanes of X. |
| xoffs | 4 bits per lane: Additional lane-dependent offset for X. |
| zbuf | Optional second multiplication input buffer. If zbuf is not specified, xbuf is taken as the second buffer. |
| zstart | Starting offset for all lanes of Z. This must be a compile time constant. |
| zoffs | 4 bits per lane: Additional lane-dependent offset for Z. |
| ones | If true, all lanes from Z are replaced with 1.0. |
| abs | If true, the absolute value is taken before accumulation. |
| addmode | Select one of the fpadd_add (all add), fpadd_sub (all sub), fpadd_mixadd or fpadd_mixsub (add-sub or sub-add pairs). This must be a compile time constant. |
| addmask | 8 x 1 LSB bits: Corresponding lane is negated if bit is set (depending on addmode). |
| cmpmode | Use fpcmp_lt to select the minimum between accumulator and result of multiplication per lane, fpcmp_ge for the maximum and fpcmp_nrm for the usual sum. |
| cmp | Optional. 8 x 1 LSB bits: When cmpmode is fpcmp_ge or fpcmp_lt, a bit is set for each lane in which the accumulator was chosen. |
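
The table can be read as a per-lane recipe, sketched below as a scalar host-side model. It is an illustration under stated assumptions, not the intrinsic: `fpmac_model`, `CmpMode`, `Result`, and `lane_offset` are hypothetical names, lane i is assumed to use bits [4i+3:4i] of the offset word, and indices are assumed to wrap modulo the buffer length. The `abs`, `addmode`, and `addmask` parameters are left out because their exact interaction is not described here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar host-side model for illustration; names are hypothetical, not the AIE API.
enum class CmpMode { Nrm, Lt, Ge };  // stand-ins for fpcmp_nrm, fpcmp_lt, fpcmp_ge

struct Result {
    std::array<float, 8> acc;  // per-lane output
    unsigned cmp;              // bit i is set when lane i kept the accumulator
};

// Extract the 4-bit offset of one lane (assumed: lane i uses bits [4i+3:4i]).
inline unsigned lane_offset(unsigned offs, int lane) {
    return (offs >> (4 * lane)) & 0xFu;
}

Result fpmac_model(const std::array<float, 8>& acc,
                   const std::array<float, 32>& xbuf, int xstart, unsigned xoffs,
                   const std::array<float, 8>& zbuf, int zstart, unsigned zoffs,
                   bool ones, CmpMode mode) {
    assert(xstart >= 0 && zstart >= 0);
    Result r{acc, 0u};
    for (int lane = 0; lane < 8; ++lane) {
        // Common start plus lane-dependent offset; wrap-around is an assumption.
        std::size_t xi = (xstart + lane_offset(xoffs, lane)) % xbuf.size();
        std::size_t zi = (zstart + lane_offset(zoffs, lane)) % zbuf.size();
        float z = ones ? 1.0f : zbuf[zi];  // "ones" replaces every Z lane with 1.0
        float prod = xbuf[xi] * z;

        bool keep_acc = false;
        switch (mode) {
            case CmpMode::Nrm:  // usual sum
                r.acc[lane] = acc[lane] + prod;
                break;
            case CmpMode::Lt:   // per-lane minimum of accumulator and product
                keep_acc = acc[lane] < prod;
                r.acc[lane] = keep_acc ? acc[lane] : prod;
                break;
            case CmpMode::Ge:   // per-lane maximum of accumulator and product
                keep_acc = acc[lane] >= prod;
                r.acc[lane] = keep_acc ? acc[lane] : prod;
                break;
        }
        if (keep_acc) {
            r.cmp |= 1u << lane;
        }
    }
    return r;
}
```

With `xoffs = 0x76543210` and `xstart = 4`, lane i of this model reads `xbuf[4 + i]`, which is one way to picture the "start plus 4-bit lane offset" addressing described for `xstart`/`xoffs` and `zstart`/`zoffs`.
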