The Hash-Join primitive currently has three versions. When performance requirements are not strict, or resources are limited, use the default Hash-Join-MPU version,
which can process 8M distinct keys or 2M input rows with key duplication. It can tolerate up to 512 hash collisions.
When keys are unique, the tolerance for different rows falling into the same hash entry is significantly higher:
up to 2^18 rows can be hashed into the same hash-table slot, counting both hash collisions and key duplication.
The Hash-Join-v3 primitive can handle hash overflow, even when the overflow is relatively large. It stores overflow rows separately from normal rows and
uses twice as many DDR/HBM ports as Hash-Join-MPU. It also delivers better performance: it supports a larger small table, achieves higher throughput, and copes better with potentially large overflow.
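The idea of keeping overflow rows apart from normal rows can be illustrated with a small software model. This is only a sketch under stated assumptions: `OverflowHashTable`, `Row`, the modulo hash, and the slot capacity are hypothetical names and choices made for illustration, not the primitive's actual interface or memory layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row type: a join key plus a small payload.
struct Row {
    std::uint64_t key;
    std::uint32_t payload;
};

// Software model (assumption, not the library's design): each hash slot has a
// fixed-capacity "normal" area; rows that do not fit go to a separate overflow area.
class OverflowHashTable {
public:
    OverflowHashTable(std::size_t slots, std::size_t slotCapacity)
        : slotCapacity_(slotCapacity), normal_(slots), overflow_(slots) {
        assert(slots > 0 && slotCapacity > 0);
    }

    // Store the row in the normal area if the slot has room, otherwise in overflow.
    void insert(const Row& row) {
        std::size_t s = slotOf(row.key);
        if (normal_[s].size() < slotCapacity_) {
            normal_[s].push_back(row);
        } else {
            overflow_[s].push_back(row);
        }
    }

    // Return payloads of all rows matching the key, scanning both areas of the slot.
    std::vector<std::uint32_t> probe(std::uint64_t key) const {
        std::vector<std::uint32_t> out;
        std::size_t s = slotOf(key);
        for (const Row& r : normal_[s]) {
            if (r.key == key) out.push_back(r.payload);
        }
        for (const Row& r : overflow_[s]) {
            if (r.key == key) out.push_back(r.payload);
        }
        return out;
    }

    // Total number of rows that landed in the overflow area.
    std::size_t overflowRows() const {
        std::size_t n = 0;
        for (const auto& bucket : overflow_) n += bucket.size();
        return n;
    }

private:
    // Illustrative hash: plain modulo over the slot count.
    std::size_t slotOf(std::uint64_t key) const {
        return static_cast<std::size_t>(key % normal_.size());
    }

    std::size_t slotCapacity_;
    std::vector<std::vector<Row>> normal_;
    std::vector<std::vector<Row>> overflow_;
};
```

In this model the two areas are separate containers, which loosely mirrors how separate storage lets normal and overflow rows be accessed through different memory ports.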
The Hash-Join-v4 primitive implements a built-in bloom filter to reduce redundant memory accesses. The bloom filter provides at least 64M 1-bit hash entries in the URAM of a single SLR on Alveo platforms,
which is 16-32x larger than the hash index. Combining the bloom filter with the hash index therefore improves hash-join performance.
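How a bloom filter cuts redundant memory accesses can be shown with a minimal software sketch: probe keys that the filter reports as definitely absent never reach the hash table. Everything here is an assumption for illustration (the `BloomFilter` class, the single mixing hash, `probeWithFilter`, and the use of `std::unordered_multimap` as the hash index); it is not the primitive's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical 1-bit-per-entry bloom filter with a single hash function.
class BloomFilter {
public:
    explicit BloomFilter(std::size_t bits) : bits_(bits), words_((bits + 63) / 64, 0) {
        assert(bits > 0);
    }

    void insert(std::uint64_t key) {
        std::size_t i = index(key);
        words_[i / 64] |= (std::uint64_t{1} << (i % 64));
    }

    // false means the key was definitely never inserted; true means "maybe".
    bool mayContain(std::uint64_t key) const {
        std::size_t i = index(key);
        return ((words_[i / 64] >> (i % 64)) & 1u) != 0;
    }

private:
    // Illustrative 64-bit mixing step, then reduce to a bit position.
    std::size_t index(std::uint64_t key) const {
        key ^= key >> 33;
        key *= 0xff51afd7ed558ccdULL;
        key ^= key >> 33;
        return static_cast<std::size_t>(key % bits_);
    }

    std::size_t bits_;
    std::vector<std::uint64_t> words_;
};

struct ProbeStats {
    std::size_t matches = 0;       // joined row pairs found
    std::size_t tableLookups = 0;  // probes that actually touched the hash table
};

// Build a filter and a hash table from the small table, then probe with the big table.
ProbeStats probeWithFilter(const std::vector<std::uint64_t>& buildKeys,
                           const std::vector<std::uint64_t>& probeKeys,
                           std::size_t filterBits) {
    BloomFilter filter(filterBits);
    std::unordered_multimap<std::uint64_t, std::size_t> table;
    for (std::size_t r = 0; r < buildKeys.size(); ++r) {
        filter.insert(buildKeys[r]);
        table.emplace(buildKeys[r], r);
    }

    ProbeStats stats;
    for (std::uint64_t k : probeKeys) {
        if (!filter.mayContain(k)) continue;  // skip the table access entirely
        ++stats.tableLookups;
        stats.matches += table.count(k);
    }
    return stats;
}
```

Because a bloom filter never produces false negatives, the number of matches is unaffected; only the count of table lookups shrinks, and it shrinks more as the filter grows relative to the number of build keys.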
For internal structure and execution flow, see: