When the group key is of limited width, it can be used directly as the address into on-chip storage, and the group aggregation can be performed in place in that storage. This scenario is called a “direct group aggregate”. Although the requirement on the group key limits its use cases, the algorithm is light on LUTs, and URAM usage can be tuned to fit the expected key width.
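
The idea can be illustrated with a small software model. This is only a sketch in plain C++, not the hardware implementation: the names `kKeyWidth`, `kNumSlots`, `Slot`, and `directGroupSum` are hypothetical, the 8-bit key width is an assumed value, and SUM plus COUNT were picked as example aggregates. An array with one slot per possible key value stands in for the on-chip memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Assumed key width for illustration; a w-bit key needs 2^w slots.
constexpr unsigned kKeyWidth = 8;
constexpr std::size_t kNumSlots = std::size_t{1} << kKeyWidth;

// One slot per possible key value, holding the running aggregates.
struct Slot {
    std::uint64_t sum = 0;
    std::uint32_t count = 0;
};

using Row = std::pair<std::uint32_t, std::uint32_t>;  // (group key, value)

// Software model of direct aggregation: the key itself is the address,
// so there is no hashing, probing, or collision handling.
std::array<Slot, kNumSlots> directGroupSum(const std::vector<Row>& rows) {
    std::array<Slot, kNumSlots> table{};  // all slots start at zero
    for (const Row& row : rows) {
        const std::uint32_t key = row.first;
        assert(key < kNumSlots);          // key must fit the assumed width
        table[key].sum += row.second;     // read-modify-write at address = key
        table[key].count += 1;
    }
    return table;
}
```

Calling `directGroupSum({{3, 10}, {7, 5}, {3, 20}})` leaves slot 3 holding a sum of 30 over 2 rows and slot 7 holding a sum of 5 over 1 row. Because the table size doubles with every extra key bit, the model also shows why the memory budget has to be matched to the expected key width.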