The weights, biases, and intermediate features are buffered in the on-chip memory. The on-chip
memory consists of RAM, which can be instantiated as block RAM and UltraRAM. The RAM Usage
option determines the total amount of on-chip memory used in the different DPUCZDX8G
architectures, and the setting applies to all the DPUCZDX8G cores in the DPUCZDX8G IP. High RAM
Usage means that the on-chip memory block is larger, which gives the DPUCZDX8G more flexibility
in handling the intermediate data. High RAM Usage implies higher performance in each DPUCZDX8G
core. The number of BRAM36K blocks used in the different architectures for low and high RAM
Usage is listed in the following table.
Note: The DPUCZDX8G instruction set differs between the RAM Usage options. When the RAM Usage
option is modified, the DPUCZDX8G instructions file should be regenerated by recompiling the
neural network. The following results are based on a DPUCZDX8G with depthwise convolution.
DPUCZDX8G Architecture | Low RAM Usage (BRAM36K blocks) | High RAM Usage (BRAM36K blocks) |
---|---|---|
B512 (4x8x8) | 73.5 | 89.5 |
B800 (4x10x10) | 91.5 | 109.5 |
B1024 (8x8x8) | 105.5 | 137.5 |
B1152 (4x12x12) | 123 | 145 |
B1600 (8x10x10) | 127.5 | 163.5 |
B2304 (8x12x12) | 167 | 211 |
B3136 (8x14x14) | 210 | 262 |
B4096 (8x16x16) | 257 | 317.5 |
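
To make the trade-off in the table above easier to work with, the following minimal Python sketch copies the BRAM36K counts into a dictionary and derives two quantities: the additional blocks that high RAM Usage costs for a given architecture, and the largest architecture whose count fits a given block budget. The names `BRAM36K`, `extra_blocks`, and `largest_fitting` are hypothetical helpers introduced for illustration only and are not part of any DPUCZDX8G tooling. The sketch assumes that the table values are used as-is (DPUCZDX8G with depthwise convolution) and does not account for UltraRAM or for other logic in the design.

```python
from typing import Optional

# BRAM36K block counts copied from the table above
# (DPUCZDX8G with depthwise convolution): (low RAM Usage, high RAM Usage).
BRAM36K = {
    "B512": (73.5, 89.5),
    "B800": (91.5, 109.5),
    "B1024": (105.5, 137.5),
    "B1152": (123.0, 145.0),
    "B1600": (127.5, 163.5),
    "B2304": (167.0, 211.0),
    "B3136": (210.0, 262.0),
    "B4096": (257.0, 317.5),
}


def extra_blocks(arch: str) -> float:
    """Additional BRAM36K blocks needed for high RAM Usage versus low."""
    low, high = BRAM36K[arch]
    return high - low


def largest_fitting(budget: float, high_ram_usage: bool) -> Optional[str]:
    """Largest architecture in the table whose block count fits the budget.

    Hypothetical helper: it only compares the table values against `budget`
    and ignores every other resource in the device.
    """
    best = None
    # The dictionary is ordered from the smallest to the largest architecture,
    # and the counts grow monotonically, so the last fit is the largest one.
    for arch, (low, high) in BRAM36K.items():
        if (high if high_ram_usage else low) <= budget:
            best = arch
    return best


if __name__ == "__main__":
    for arch in BRAM36K:
        print(f"{arch}: +{extra_blocks(arch)} BRAM36K for high RAM Usage")
    print(largest_fitting(150, high_ram_usage=True))
    print(largest_fitting(150, high_ram_usage=False))
```

With a budget of 150 blocks, for example, the table allows B1152 with high RAM Usage or B1600 with low RAM Usage, which shows how the option can decide between a larger architecture and a larger buffer.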