Convolutional Neural Network for MNIST Handwriting Recognition

This article shows, step by step, how to design a convolutional neural network (LeNet) in the DLHUB deep learning software for an MNIST handwriting recognition application.

Tuan Nghia Nguyen · 6/28/2019 2:33:36 PM

MNIST_Handwriting_Recognition_DLHUB.jpg

 

 

Many tutorials on MNIST handwriting recognition already exist on the internet. However, they mainly use deep learning platforms such as TensorFlow, CNTK, MXNet, PyTorch, Caffe, and so on. To design a proper deep learning model with these platforms, a user needs not only solid knowledge of the field but also Python programming skills and familiarity with the application programming interface (API) of at least one platform. The purpose of this article is to introduce a way to design a deep learning model without Python programming, without learning a complicated deep learning API, and without deep knowledge of deep learning architectures. DLHUB is designed to reduce this design process to a few clicks.

 

 

 

1. Obtain the MNIST dataset.

 

You can download the MNIST dataset directly from http://yann.lecun.com/exdb/mnist. The dataset consists of 60,000 samples in a training set and 10,000 samples in a test set. It is a subset of a larger dataset available from the National Institute of Standards and Technology (NIST). However, there is an easier way to get the MNIST dataset: DLHUB ships with ready-to-use examples that let you explore its features.
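If you do download the files from the site above, they arrive as gzip-compressed files in the IDX binary format. The following is a minimal sketch of reading the image file header after decompression, assuming the standard IDX layout (four big-endian 32-bit integers: magic number 2051, image count, rows, columns, followed by one byte per pixel). The function name `parse_idx_images` is hypothetical, and the example parses a small synthetic buffer rather than a real download.

```python
import struct

import numpy as np


def parse_idx_images(raw):
    """Parse the bytes of a decompressed IDX image file into a uint8 array."""
    # Header: magic number, image count, rows, columns (big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"unexpected magic number {magic}, expected 2051")
    pixels = np.frombuffer(raw, dtype=np.uint8,
                           count=count * rows * cols, offset=16)
    return pixels.reshape(count, rows, cols)


# Synthetic stand-in for a real file: two blank 28x28 images.
fake_file = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
images = parse_idx_images(fake_file)
print(images.shape)  # (2, 28, 28)
```

With a real download, the bytes could come from `gzip.open(path, "rb").read()`. The DLHUB examples described next avoid this step entirely.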

 

When you first launch DLHUB, it shows the load training data interface. To get the DLHUB examples, click the Help button (1) to open the Training Data File Help dialog, which lets you download the DLHUB examples (2). This dialog also describes the correct format for different types of dataset.

Download_DLHUB_Examples.jpg

 

After DLHUB examples.zip is downloaded, extract it with an unzip utility (WinRAR, for example) and place the Examples folder in the DLHUBData location as shown below:

DLHUB_Examples_location.jpg 

 

Each MNIST sample is a 28x28x1 handwriting image that represents a number from 0 to 9, so there are 10 different labels, one for each number. The 28x28x1 image can be flattened into an array of 784 features, where each feature is the value of one pixel. To encode the output number (from 0 to 9), we can use a 10-bit binary code in which exactly one bit is set. For example, 0 0 0 0 0 0 0 0 0 1 is equivalent to number 0, and 1 0 0 0 0 0 0 0 0 0 is for number 9. The MNIST dataset therefore looks as follows:
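The flattening and label encoding described above can be sketched in a few lines of NumPy. This is an illustration only, not DLHUB's own converter: the helper names `flatten_image` and `encode_label` are hypothetical, and the bit order is inferred from the two examples in the text (number 0 sets the last bit, number 9 sets the first). Note that many libraries use the opposite order, with the set bit at the index equal to the digit.

```python
import numpy as np

NUM_CLASSES = 10


def flatten_image(image):
    """Flatten a 28x28x1 image into a vector of 784 pixel features."""
    image = np.asarray(image)
    if image.shape != (28, 28, 1):
        raise ValueError(f"expected shape (28, 28, 1), got {image.shape}")
    return image.reshape(-1)


def encode_label(digit):
    """Encode a digit as 10 bits, following the order of the text's examples.

    Assumption: digit 0 sets the last bit and digit 9 sets the first bit.
    """
    if not 0 <= digit < NUM_CLASSES:
        raise ValueError("digit must be between 0 and 9")
    code = np.zeros(NUM_CLASSES, dtype=int)
    code[NUM_CLASSES - 1 - digit] = 1
    return code


image = np.zeros((28, 28, 1), dtype=np.uint8)  # blank placeholder image
features = flatten_image(image)
print(features.shape)   # (784,)
print(encode_label(0))  # [0 0 0 0 0 0 0 0 0 1]
print(encode_label(9))  # [1 0 0 0 0 0 0 0 0 0]
```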

 

 MINIST_Dataset.jpg

2. Loading MNIST dataset into DLHUB

 

Once the MNIST data is in the right format, it can be loaded into DLHUB on the loading data page. On that page, browse to the MNIST dataset location (1), confirm the dataset input shape (28x28x1) (2), and click Next (3) to go to the deep learning design page.

 

Loading_MINIST_data_into_DLHUB.jpg

 

3. Design a Convolutional Neural Network deep learning model for the MNIST application in DLHUB

 

To show how easy it is to design a deep learning model in DLHUB, this article uses a well-known Keras example in Python (https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) for comparison. Because the Python APIs of deep learning platforms (TensorFlow, CNTK, MXNet, ...) are complex, it is not easy to design a deep learning model with them directly, so Keras (https://keras.io/) was designed to provide a high-level wrapper around popular deep learning APIs in Python.

 

LeNet, originally introduced by LeCun in 1998, is one of the most famous convolutional neural networks for image recognition. The LeNet structure is shown below:

MNIST_LeNet_Architecture.jpg

 

   

In LeNet, an image is filtered by a few trainable convolution and pooling layers before being flattened and fed to fully connected layers. With this structure, the features of the image are extracted by the convolution and pooling layers. As a result, the classification accuracy is significantly improved compared with a traditional model that contains only fully connected layers. For more information, please see LeCun's publication.
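To make the filter, pool, and flatten sequence concrete, here is a minimal NumPy sketch for a single channel and a single 3x3 kernel. It is an illustration under simplifying assumptions, not LeNet itself: the kernel is random rather than trained, and the helper names `conv2d_valid` and `max_pool_2x2` are hypothetical.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide a kernel over a 2-D image with stride 1 and no padding.

    As in most deep learning libraries, the kernel is not flipped, so this
    is strictly a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop an odd trailing row or column
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


rng = np.random.default_rng(0)
image = rng.random((28, 28))          # stand-in for one MNIST digit
kernel = rng.standard_normal((3, 3))  # would be learned during training

filtered = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU activation
pooled = max_pool_2x2(filtered)
flat = pooled.reshape(-1)             # input to the fully connected layers
print(filtered.shape, pooled.shape, flat.shape)  # (26, 26) (13, 13) (169,)
```

A real network applies many such kernels per layer and learns their values, but the shape changes at each stage follow the same pattern.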

 

Since LeNet was introduced, there have been several variations in its structure to improve classification accuracy. In this article, we use the architecture from the Keras example for comparison purposes. The Keras Python code to construct the LeNet model is shown as follows:

'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

 

 

The same LeNet architecture can be constructed in DLHUB with just a few clicks. First, create a new file to store the LeNet structure (1), then select the appropriate layer from the Select Functions palette (2) and configure its parameters in (3). Repeat the process until the complete LeNet model is constructed (red rectangular area). Once the model is constructed, use the verification button (4) to check that it is constructed correctly before proceeding to the next step (6). The LeNet structure can be saved to a file at any time during the design process.
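When entering the layers by hand, it helps to know what shape each one should produce. The sketch below traces the output shapes and trainable parameter counts for the layer sequence in the Keras listing above, using the standard formulas for unpadded convolution, pooling, and dense layers. It says nothing about how DLHUB's verification button works internally, and the helper names are hypothetical. Dropout layers are omitted because they change neither the shape nor the parameter count.

```python
def conv2d(shape, filters, k):
    """Unpadded, stride-1 convolution: output shape and parameter count."""
    h, w, c = shape
    params = k * k * c * filters + filters  # weights plus one bias per filter
    return (h - k + 1, w - k + 1, filters), params


def max_pool(shape, p):
    """Non-overlapping p x p pooling has no trainable parameters."""
    h, w, c = shape
    return (h // p, w // p, c), 0


def flatten(shape):
    h, w, c = shape
    return (h * w * c,), 0


def dense(shape, units):
    (n,) = shape
    return (units,), n * units + units  # weights plus one bias per unit


layers = [
    ("Conv2D 32 (3x3)", lambda s: conv2d(s, 32, 3)),
    ("Conv2D 64 (3x3)", lambda s: conv2d(s, 64, 3)),
    ("MaxPooling 2x2", lambda s: max_pool(s, 2)),
    ("Flatten", flatten),
    ("Dense 128", lambda s: dense(s, 128)),
    ("Dense 10", lambda s: dense(s, 10)),
]

shape = (28, 28, 1)  # input shape confirmed when loading the dataset
total_params = 0
shapes = []
for name, layer in layers:
    shape, params = layer(shape)
    shapes.append(shape)
    total_params += params
    print(f"{name:<16} {str(shape):<14} {params:>9,}")

print("total parameters:", total_params)  # total parameters: 1199882
```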

 

 

Construct_LeNet_In_DLHUB.jpg

 

 

 

 

4. Train a Convolutional Neural Network deep learning model for the MNIST application in DLHUB

 

Configuring training parameters in DLHUB is simple with its easy-to-use graphical user interface. In this article, we use the same training algorithm and parameters as the Keras example. First, select the training algorithm, loss function, accuracy metric type, and training parameters (1), then specify when to stop the training process (2) before starting it (3). If a GPU is detected in the host PC, it will be used to speed up training.

 

Train_LeNET_In_DLHUB.jpg

 

The training process finishes after 12 epochs and takes only 72 seconds for the 60,000-sample training set on an Alienware 15 R3 laptop.
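These figures can be turned into per-epoch numbers with a little arithmetic. The sketch below assumes the batch size of 128 from the Keras listing was also used in DLHUB, which the article does not state explicitly. The sample count, epoch count, and total time are taken from the text.

```python
import math

train_samples = 60_000
epochs = 12
total_seconds = 72
batch_size = 128  # assumption: same value as in the Keras listing

batches_per_epoch = math.ceil(train_samples / batch_size)
seconds_per_epoch = total_seconds / epochs
total_updates = batches_per_epoch * epochs
samples_per_second = train_samples * epochs / total_seconds

print(batches_per_epoch)   # 469
print(seconds_per_epoch)   # 6.0
print(total_updates)       # 5628
print(samples_per_second)  # 10000.0
```

The 16 seconds per epoch quoted in the Keras docstring was measured on different hardware (a GRID K520 GPU), so the two timings are not directly comparable.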

 

5. Evaluate a Convolutional Neural Network deep learning model for the MNIST application in DLHUB

 

To test the performance of the trained model, DLHUB includes an evaluation process. Import a test dataset (10,000 unseen samples) from a correctly formatted file (1) and run the evaluation on it (2). The accuracy on this 10,000-sample test set is around 99%, depending on the learning rate specified in step 4.
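For reference, classification accuracy of this kind is typically the fraction of samples whose highest-scoring output matches the set bit of the label. The following is a minimal sketch with toy data, not DLHUB's evaluation code. The function name `accuracy` and the sample values are hypothetical.

```python
import numpy as np


def accuracy(scores, one_hot_labels):
    """Fraction of rows where the top-scoring class matches the label."""
    predicted = np.argmax(scores, axis=1)
    expected = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == expected))


# Toy data: four samples, ten output scores each.
labels = np.eye(10)[[3, 1, 4, 1]]
scores = np.full((4, 10), 0.01)
scores[0, 3] = 0.91  # correct
scores[1, 1] = 0.91  # correct
scores[2, 4] = 0.91  # correct
scores[3, 7] = 0.91  # wrong: the label is class 1

print(accuracy(scores, labels))  # 0.75
```

Because both sides are compared by position, the result does not depend on which bit order the labels use, as long as the model outputs and the labels use the same one.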

 

MNIST_Evaluartion_in_DLHUB.jpg

 

 

6. Test the trained model with a new data set.

Users can also test how a trained model works inside DLHUB before deciding to use it in actual production or deployment applications. To do so, load a new data folder (1), select a data file to evaluate (2), view the data if it is an image (3), and confirm that the prediction result is correct (4).

 

Test_new_MNIST_Data_in_DLHUB.jpg

 

 

 

 

 Conclusion

 

In this article, we have demonstrated how easy it is to design a deep learning convolutional neural network using DLHUB software. No programming skill or deep knowledge of deep learning is required. The configuration, training, and evaluation processes in DLHUB are quick and simple.

Training is also faster than on Python-based platforms because DLHUB does not need a Python interpreter.

   

 

 

 
