void GramSchmidtKernelComplexFloat::process(input_stream_cfloat* in_0,
input_stream_cfloat* in_1,
output_stream_cfloat* out_0,
output_stream_cfloat* out_1);
Note
- To make full use of the input / output stream bandwidth, the input matrix and the output result are transferred as follows: Elem[N*4] and Elem[N*4+1] are transferred on in_0 / out_0, and Elem[N*4+2] and Elem[N*4+3] are transferred on in_1 / out_1.
- Input:
input_stream_cfloat* in_0
stream of the input matrix, contains the lower two elements of each group of 4 elements.
input_stream_cfloat* in_1
stream of the input matrix, contains the higher two elements of each group of 4 elements.
- Output:
output_stream_cfloat* out_0
stream of the output matrix, contains the lower two elements of each group of 4 elements.
output_stream_cfloat* out_1
stream of the output matrix, contains the higher two elements of each group of 4 elements.
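
The element-to-stream mapping described in the note can be illustrated on the host side. The following is only a minimal sketch, not part of the kernel or the library: it uses `std::complex<float>` and `std::vector` as stand-ins for the cfloat streams, the helper names `split_for_streams` and `merge_from_streams` are hypothetical, and it assumes the total element count is a multiple of 4.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the AIE cfloat type (assumption for this host-side sketch).
using cfloat = std::complex<float>;

// Hypothetical helper: distribute a flat element sequence over two payloads.
// Elem[N*4] and Elem[N*4+1] go to the first payload (in_0 / out_0),
// Elem[N*4+2] and Elem[N*4+3] go to the second payload (in_1 / out_1).
std::pair<std::vector<cfloat>, std::vector<cfloat>>
split_for_streams(const std::vector<cfloat>& elems) {
    std::vector<cfloat> s0, s1;
    for (std::size_t i = 0; i < elems.size(); ++i) {
        if (i % 4 < 2) {
            s0.push_back(elems[i]);
        } else {
            s1.push_back(elems[i]);
        }
    }
    return {s0, s1};
}

// Hypothetical helper: rebuild the flat element order from the two payloads.
std::vector<cfloat> merge_from_streams(const std::vector<cfloat>& s0,
                                       const std::vector<cfloat>& s1) {
    // Both payloads must hold the same, even number of elements.
    assert(s0.size() == s1.size() && s0.size() % 2 == 0);
    std::vector<cfloat> elems;
    elems.reserve(s0.size() + s1.size());
    for (std::size_t n = 0; 2 * n < s0.size(); ++n) {
        elems.push_back(s0[2 * n]);      // Elem[N*4]
        elems.push_back(s0[2 * n + 1]);  // Elem[N*4+1]
        elems.push_back(s1[2 * n]);      // Elem[N*4+2]
        elems.push_back(s1[2 * n + 1]);  // Elem[N*4+3]
    }
    return elems;
}
```

A test bench could call `split_for_streams` to prepare the data fed to in_0 and in_1, and `merge_from_streams` to restore the original element order from what is read back on out_0 and out_1.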