The weights, biases, and intermediate feature maps are buffered in the on-chip memory. The on-chip memory consists of RAM which can be instantiated as block RAM and UltraRAM. The RAM Usage option determines the total amount of on-chip memory used in each DPUCZDX8G architecture, and the setting applies to all the DPUCZDX8G cores in the DPUCZDX8G IP. High RAM Usage makes the on-chip memory block larger, which gives the DPUCZDX8G more flexibility in handling the intermediate data and implies higher performance in each DPUCZDX8G core. The following table lists the number of BRAM36K blocks used by each architecture for low and high RAM Usage.
Note: The DPUCZDX8G instruction set differs between RAM Usage options. When the RAM Usage option is modified, regenerate the DPUCZDX8G instructions file by recompiling the neural network. The following results are based on a DPUCZDX8G with depthwise convolution.
DPUCZDX8G Architecture | Low RAM Usage (BRAM36K blocks) | High RAM Usage (BRAM36K blocks) |
---|---|---|
B512 | 72 | 88 |
B800 | 90 | 108 |
B1024 | 104 | 136 |
B1152 | 121 | 143 |
B1600 | 126 | 162 |
B2304 | 165 | 209 |
B3136 | 208 | 260 |
B4096 | 255 | 315 |
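
To make the trade-off in the table easier to compare, the following minimal sketch computes how many extra BRAM36K blocks High RAM Usage costs for each architecture. The block counts are copied from the table above. The names `BRAM36K_BLOCKS`, `high_ram_overhead`, and `nominal_kbits` are hypothetical helpers introduced for illustration only and are not part of any DPUCZDX8G tool or API. The capacity figure assumes the nominal 36 Kb per BRAM36K block and ignores any portion of the memory mapped to UltraRAM.

```python
# BRAM36K block counts copied from the table above
# (DPUCZDX8G with depthwise convolution).
BRAM36K_BLOCKS = {
    "B512":  {"low": 72,  "high": 88},
    "B800":  {"low": 90,  "high": 108},
    "B1024": {"low": 104, "high": 136},
    "B1152": {"low": 121, "high": 143},
    "B1600": {"low": 126, "high": 162},
    "B2304": {"low": 165, "high": 209},
    "B3136": {"low": 208, "high": 260},
    "B4096": {"low": 255, "high": 315},
}

# Assumption: nominal capacity of one BRAM36K block, in kilobits.
BRAM36K_KBITS = 36


def high_ram_overhead(arch):
    """Return (extra_blocks, extra_percent) of High over Low RAM Usage."""
    entry = BRAM36K_BLOCKS[arch]
    extra = entry["high"] - entry["low"]
    return extra, 100.0 * extra / entry["low"]


def nominal_kbits(arch, ram_usage):
    """Return the nominal BRAM capacity in Kb for 'low' or 'high' RAM Usage."""
    return BRAM36K_BLOCKS[arch][ram_usage] * BRAM36K_KBITS


if __name__ == "__main__":
    for arch in BRAM36K_BLOCKS:
        extra, pct = high_ram_overhead(arch)
        print(f"{arch}: +{extra} BRAM36K blocks ({pct:.1f}% more than Low RAM Usage)")
```

A comparison like this may help when deciding whether the target device has enough block RAM left for High RAM Usage. The performance gain itself depends on the network and is not modeled here.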