The overall AI Engine graph of the MNIST ConvNet classifier is shown in the diagram below.
Each layer is generally assigned to its own AI Engine tile as outlined above.
Layers with weights and biases occupy two AI Engine tiles. One tile performs the compute workload on the input images. The second, a "weight delivery" tile, reads the weights from the PLIO at design startup and delivers them through an asynchronous buffer mechanism to the compute tile's weight input buffer, which is filled once. The compute tile can then access these weights continuously as the design runs.
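The two-tile pattern can be sketched behaviorally in plain Python (this is not AI Engine graph code; the function names and values are illustrative): the weight-delivery step runs once at startup to fill a buffer, and the compute step then reuses that buffer for every input frame.

```python
# Behavioral sketch of the "weight delivery" + "compute" tile pair.
# All names and data here are hypothetical placeholders.

def weight_delivery_tile(plio_stream):
    """Runs once at startup: drains the weight words arriving from the
    PLIO and returns them as the compute tile's weight buffer."""
    return list(plio_stream)

def compute_tile(frame, weights):
    """Per-frame compute: a dot product stands in for the real
    convolution kernel."""
    return sum(p * w for p, w in zip(frame, weights))

# Startup: fill the weight buffer once from the PLIO stream.
weights = weight_delivery_tile(iter([0.5, -1.0, 2.0]))

# Steady state: the same buffer is reused for every incoming frame.
outputs = [compute_tile(frame, weights) for frame in ([1, 2, 3], [4, 5, 6])]
```

The key design point mirrored here is that the weight transfer happens exactly once, while the compute path runs repeatedly against the already-filled buffer.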
The conv2d_w5() layer is partitioned over four tiles to manage its weight storage, which is too large to fit into the available 32 KB of local tile memory. Based on its ~74,000 parameters, the storage must be split over a minimum of four AI Engine tiles. Its compute workload is also partitioned over the four tiles, with each tile computing one quarter of the output layer samples. Further details are outlined below.

The last AI Engine tile contains the flatten_w6() layer, the dense_w7() layer, and the softmax() compute workload that produces the final classifier output. The max pooling layers max_pooling2d_w2() and max_pooling2d_w4() have no weights and so are each implemented in a single AI Engine tile.