Given K sorted blocks as input, the kernel merges them into a single, fully sorted output. The value of K does not affect the kernel's clock frequency, and the kernel still produces one output element per cycle. Its design is shown in the figure below.
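As a software reference for the input/output behavior described above, here is a minimal sketch of a K-way merge. It is not the kernel's hardware design: the binary tree of 2-way mergers and the names `merge2` and `merge_k` are assumptions made for illustration. Each stage compares only two stream heads, which hints at how the per-stage work could stay constant as K grows.

```python
from typing import Iterable, Iterator, List

_END = object()  # sentinel marking an exhausted stream


def merge2(a: Iterable[int], b: Iterable[int]) -> Iterator[int]:
    """Merge two sorted streams, emitting one element per step."""
    ia, ib = iter(a), iter(b)
    x, y = next(ia, _END), next(ib, _END)
    while x is not _END or y is not _END:
        # Take from `a` when `b` is exhausted or a's head is not larger.
        if y is _END or (x is not _END and x <= y):
            yield x
            x = next(ia, _END)
        else:
            yield y
            y = next(ib, _END)


def merge_k(blocks: List[List[int]]) -> List[int]:
    """Merge K sorted blocks using a binary tree of 2-way mergers (hypothetical structure)."""
    streams: List[Iterator[int]] = [iter(b) for b in blocks]
    if not streams:
        return []
    # Pair up streams level by level until a single merged stream remains.
    while len(streams) > 1:
        nxt: List[Iterator[int]] = [
            merge2(streams[i], streams[i + 1])
            for i in range(0, len(streams) - 1, 2)
        ]
        if len(streams) % 2:
            nxt.append(streams[-1])  # odd stream passes through to the next level
        streams = nxt
    return list(streams[0])
```

For example, `merge_k([[1, 4, 7], [2, 5, 8], [3, 6, 9]])` consumes three sorted blocks and yields their elements in a single sorted sequence. This models only the ordering guarantee, not cycle-level timing.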