To update portions of vector registers, the upd_v(), upd_w(), and upd_x()
intrinsic functions are provided for 128-bit (v), 256-bit (w), and 512-bit (x)
updates. Similarly, the ext_v(), ext_w(), and ext_x() intrinsic functions are
provided to extract portions of the vector.
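The behavior described above can be sketched with a host-side model in plain C++. This is not the intrinsic API itself: the names model_ext_w and model_upd_w, the std::array stand-ins for the vector types, and the (vector, index, value) argument order are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side stand-ins for a 512-bit and a 256-bit vector of int32 lanes.
using Vec16 = std::array<std::int32_t, 16>;
using Vec8  = std::array<std::int32_t, 8>;

// Hypothetical model of ext_w(v, idx): return 256-bit portion idx (0 or 1).
Vec8 model_ext_w(const Vec16& v, std::size_t idx) {
    assert(idx < 2);
    Vec8 out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[i] = v[idx * 8 + i];
    }
    return out;
}

// Hypothetical model of upd_w(v, idx, w): return a copy of v in which
// 256-bit portion idx has been replaced by w; other lanes are unchanged.
Vec16 model_upd_w(Vec16 v, std::size_t idx, const Vec8& w) {
    assert(idx < 2);
    for (std::size_t i = 0; i < 8; ++i) {
        v[idx * 8 + i] = w[i];
    }
    return v;
}
```

The model returns a new value instead of modifying its argument, which matches the way the update intrinsics are used in assignments elsewhere in this section.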
To update or extract individual elements, the upd_elem() and ext_elem()
intrinsic functions are provided. These must be used when loading or storing
values that are not in contiguous memory locations and require multiple clock
cycles to load or store a vector. In the following example, the 0th element of
vector v1 is updated with the value of a, which is 100.
int a = 100;
v4int32 v1 = upd_elem(undef_v4int32(), 0, a);
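As a rough illustration of the non-contiguous case, the following host-side sketch builds a four-element vector one lane at a time from strided memory. It is a plain C++ model, not the intrinsics themselves: model_upd_elem, model_ext_elem, gather_strided, and the Vec4 type are hypothetical names introduced here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side stand-in for a 128-bit vector of four int32 lanes.
using Vec4 = std::array<std::int32_t, 4>;

// Hypothetical model of upd_elem(v, idx, val): copy of v with lane idx set.
Vec4 model_upd_elem(Vec4 v, std::size_t idx, std::int32_t val) {
    assert(idx < 4);
    v[idx] = val;
    return v;
}

// Hypothetical model of ext_elem(v, idx): read lane idx as a scalar.
std::int32_t model_ext_elem(const Vec4& v, std::size_t idx) {
    assert(idx < 4);
    return v[idx];
}

// Gather four values that sit `stride` elements apart in memory.
// Each lane needs its own scalar read and update, which is why this
// pattern costs several cycles compared with one contiguous vector load.
Vec4 gather_strided(const std::int32_t* src, std::size_t stride) {
    Vec4 v{};  // zero-initialised in this model
    for (std::size_t i = 0; i < 4; ++i) {
        v = model_upd_elem(v, i, src[i * stride]);
    }
    return v;
}
```

The model zero-initialises the vector for determinism, whereas the example above starts from undef_v4int32(); every lane is overwritten in the loop, so the starting contents do not matter here.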
Another important use is to move data to the scalar unit and do an inverse or
sqrt. In the following example, the 0th element of vector vf is extracted and
stored in the scalar variable f, and its inverse square root is then computed
on the scalar unit.
v4float vf;
float f = ext_elem(vf, 0);
float i_f = invsqrt(f);
The shft_elem() intrinsic function can be used to update a vector by inserting
a new element at the beginning of a vector and shifting the other elements by
one.
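The shifting behavior can be pictured with a small host-side model. This sketch assumes that the element shifted out of the last position is discarded and that the new element lands in lane 0; the name model_shft_elem and the Vec4 type are hypothetical and only mirror the description above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Host-side stand-in for a vector of four int32 lanes.
using Vec4 = std::array<std::int32_t, 4>;

// Hypothetical model of shft_elem(v, val): val becomes lane 0, every
// existing lane moves up by one, and the old last lane is dropped
// (assumed behavior for this sketch).
Vec4 model_shft_elem(const Vec4& v, std::int32_t val) {
    Vec4 out{};
    out[0] = val;
    for (std::size_t i = 1; i < out.size(); ++i) {
        out[i] = v[i - 1];
    }
    return out;
}
```

Calling the model repeatedly with a stream of incoming samples keeps the most recent four samples in the vector, newest first, which is a common way to maintain a sliding window.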
A vector register can be converted to an accumulator register with the ups
intrinsic function. Accumulator registers can also be updated one half at a
time with the upd_hi and upd_lo intrinsic functions.
//From v16int32 to v16acc48
v16int32 v;
v16acc48 acc;
acc = upd_lo(acc, ups(ext_w(v, 0), 0)); // convert the first 256 bits of v
acc = upd_hi(acc, ups(ext_w(v, 1), 0)); // convert the second 256 bits of v
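The two-step conversion above can be modeled on the host as follows. This is a sketch under stated assumptions, not the intrinsic API: 48-bit accumulator lanes are held in int64_t, ups is assumed to shift each lane left by the given number of bits, and upd_lo and upd_hi are assumed to write lanes 0 to 7 and lanes 8 to 15 respectively. All model_ names, the type aliases, and to_acc are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<std::int32_t, 16>;  // stand-in for v16int32
using Vec8  = std::array<std::int32_t, 8>;   // stand-in for v8int32
using Acc8  = std::array<std::int64_t, 8>;   // stand-in for v8acc48
using Acc16 = std::array<std::int64_t, 16>;  // stand-in for v16acc48

// Hypothetical model of ext_w(v, idx): 256-bit portion idx of v.
Vec8 model_ext_w(const Vec16& v, std::size_t idx) {
    assert(idx < 2);
    Vec8 out{};
    for (std::size_t i = 0; i < 8; ++i) out[i] = v[idx * 8 + i];
    return out;
}

// Hypothetical model of ups(w, shift): widen each lane and scale it by
// 2^shift. Multiplication avoids left-shifting a negative value. The
// shift limit keeps a 32-bit input inside 48 bits in this model.
Acc8 model_ups(const Vec8& w, unsigned shift) {
    assert(shift <= 16);
    Acc8 out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[i] = static_cast<std::int64_t>(w[i]) * (std::int64_t{1} << shift);
    }
    return out;
}

// Hypothetical models of upd_lo / upd_hi: replace one half of acc.
Acc16 model_upd_lo(Acc16 acc, const Acc8& half) {
    for (std::size_t i = 0; i < 8; ++i) acc[i] = half[i];
    return acc;
}
Acc16 model_upd_hi(Acc16 acc, const Acc8& half) {
    for (std::size_t i = 0; i < 8; ++i) acc[8 + i] = half[i];
    return acc;
}

// Same two-step pattern as the example: convert each half, then merge.
Acc16 to_acc(const Vec16& v, unsigned shift) {
    Acc16 acc{};
    acc = model_upd_lo(acc, model_ups(model_ext_w(v, 0), shift));
    acc = model_upd_hi(acc, model_ups(model_ext_w(v, 1), shift));
    return acc;
}
```

With a shift of 0, as in the example above, every accumulator lane simply holds the widened input value.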