The data mover comprises three loops (inp_A, inp_B, and out_C), all of which are scheduled concurrently.
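As a rough illustration of concurrently scheduled loops, the sketch below runs inp_A, inp_B, and out_C as three Python threads connected by bounded FIFOs. Everything beyond the three loop names is an assumption made for the example: the function `run_data_mover`, the FIFO depth of 2, and the placeholder `combine` operation that out_C applies before writing are hypothetical and not taken from the original design.

```python
import queue
import threading


def run_data_mover(a, b, combine=lambda x, y: x + y, depth=2):
    """Run inp_A, inp_B and out_C as three concurrent loops (illustrative only)."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")

    # Bounded FIFOs between the loops; the depth is an assumed value.
    fifo_a = queue.Queue(maxsize=depth)
    fifo_b = queue.Queue(maxsize=depth)
    c = []

    def inp_A():
        # Input loop: stream elements of A into its FIFO.
        for x in a:
            fifo_a.put(x)

    def inp_B():
        # Input loop: stream elements of B into its FIFO.
        for y in b:
            fifo_b.put(y)

    def out_C():
        # Output loop: take one element from each FIFO and write a result.
        # The combine step is a stand-in for whatever sits between the movers.
        for _ in range(len(a)):
            c.append(combine(fifo_a.get(), fifo_b.get()))

    # All three loops are started together, so none waits for another to finish.
    threads = [threading.Thread(target=f) for f in (inp_A, inp_B, out_C)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c
```

Because the FIFOs are bounded, an input loop that runs ahead simply blocks until out_C drains an element, which is the back-pressure behaviour one would expect when the loops overlap rather than run one after another.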