The first step of this tutorial is to build a computer model of the MNIST ConvNet and to train the model to obtain a set of weights that can be used for inference. The tutorial uses the Keras framework for this purpose: Keras provides a simple, flexible, and powerful API, built on top of TensorFlow, for building machine learning network models. The full Jupyter notebook for this model is provided in MNIST-Convnet-demo.ipynb. To run the notebook, execute the following command:
% jupyter-notebook MNIST-Convnet-demo.ipynb
The Keras model for the MNIST ConvNet classifier is based on an example from the Keras website and is given by the following Python code:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(28, 28, 1), name="input")  # 28x28 grayscale images
x1 = layers.Conv2D(filters=16, kernel_size=3, activation="relu", name="conv2d_w1")(inputs)
x2 = layers.MaxPooling2D(pool_size=2, name="max_pooling2d_w2")(x1)
x3 = layers.Conv2D(filters=64, kernel_size=3, activation="relu", name="conv2d_w3")(x2)
x4 = layers.MaxPooling2D(pool_size=2, name="max_pooling2d_w4")(x3)
x5 = layers.Conv2D(filters=128, kernel_size=3, activation="relu", name="conv2d_w5")(x4)
x6 = layers.Flatten(name="flatten_w6")(x5)
outputs = layers.Dense(10, activation="softmax", name="dense_w7")(x6)  # 10 digit classes
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
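Once the model is compiled, it can be trained on the MNIST dataset using the standard Keras fit() workflow. The following is a minimal sketch; the epoch count and batch size shown here are illustrative, and the settings used in the full notebook may differ:
# Load MNIST, add a channel dimension, and scale pixels to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape((-1, 28, 28, 1)).astype("float32") / 255.0
x_test = x_test.reshape((-1, 28, 28, 1)).astype("float32") / 255.0

# Train, then measure accuracy on the held-out test set
model.fit(x_train, y_train, epochs=5, batch_size=64)
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")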
The network contains seven layers:
The first layer is a 2D convolutional layer that applies sixteen 3x3 convolutional kernels to the input, followed by a ReLU activation function at its output. Each kernel has its own set of 3 x 3 = 9 multiplicative weights along with a single additive bias weight, so the total number of weights for this layer is 16 x (9 + 1) = 160. (All of the per-layer weight counts in this list can be verified with the short sketch that follows it.)
The second layer is a “max pooling” layer that performs a decimation-by-two in both image dimensions. There are no weights associated with this layer.
The third layer is another 2D convolutional layer, this time applying 64 3x3 convolutional kernels followed by a ReLU activation function. Because each kernel spans the 16 input channels, this layer involves a total of 16 x 64 x 9 = 9,216 multiplicative weights and 64 additive bias weights, for a total of 9,280 weights.
The fourth layer is another “max pooling” layer, again decimating by two in both image dimensions with no additional weights.
The fifth layer is a third 2D convolutional layer with 128 3x3 convolutional kernels and a ReLU activation at its output. With 64 input channels, this layer uses a total of 64 x 128 x 9 = 73,728 multiplicative weights and 128 additive bias weights, for a total of 73,856 weights.
The sixth layer performs a flattening function, collapsing the 3 x 3 x 128 = 1,152 connections into a single 1D bus. (The spatial size has shrunk from 28 x 28 to 3 x 3 because each valid 3x3 convolution trims two pixels from each dimension, 28 to 26, 13 to 11, and 5 to 3, while each pooling layer halves the size, 26 to 13 and 11 to 5.)
The seventh layer is a fully connected dense network mapping the 1,152 inputs to the 10 output classes, using 1,152 x 10 = 11,520 multiplicative weights and 10 additive bias weights, for a total of 11,530 weights.
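The per-layer counts above can be confirmed directly from the model using the built-in Keras count_params() method; the following snippet prints the per-layer and total counts for the model defined earlier:
# Print the number of weights in each layer, then the total for the model
for layer in model.layers:
    print(f"{layer.name}: {layer.count_params()} weights")
print(f"total: {model.count_params()} weights")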
The total number of weights for this network is 94,826, or approximately 100,000, which makes this a small ConvNet. The diagram below summarizes the layers of the MNIST ConvNet.