The purpose of this application is to design a Neural Network that recognizes the digits 1 to 9 in handwritten grayscale images (28 pixels by 28 pixels), as shown in Figure 1.
Figure 1: Overview of the handwriting recognition application.
Although handwriting recognition based on the MNIST database is not well-suited for machine learning experiments, this application was selected to demonstrate that ANNHub can cope with large datasets and that an overall accuracy of around 90% can be achieved.
The MNIST database, which can be obtained from http://yann.lecun.com/exdb/mnist/, consists of 60,000 samples in a training set and 10,000 samples in a test set. It is a subset of a larger dataset available from the National Institute of Standards and Technology (NIST).
Figure 2: Handwriting recognition dataset.
The first step is to convert the MNIST data into a supported format that can be loaded into ANNHub. Since each image in the MNIST database is a 28x28 grayscale image, it can be represented as a 2D array (28x28) whose element values lie within the [0;255] range. Because the input layer of the Neural Network only accepts a 1D array, the 2D array must be flattened into a 1D array of length 28x28 = 784. The output of the Neural Network is a number, from 1 to 9, that corresponds to the input image. Figure 3 shows the format of the MNIST dataset in a "csv" file.
The first 784 columns hold the image input, and the last column holds the output (target) value. Each row represents one sample in the MNIST database. The dataset, which includes both the training set and the test set in csv format, can be found in the ANNHub installation folder (Examples>Classification Examples>MNIST).
Figure 3: MNIST dataset files, training and test sets, in csv format.
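The row layout described above (784 pixel columns followed by one target column) can be illustrated with a minimal Python sketch. This is not the tool used to produce the files shipped with ANNHub; the helper names `flatten_image` and `to_csv_row` and the sample image are hypothetical and only show how a 28x28 array maps onto one csv row.

```python
import csv
import io

def flatten_image(image):
    """Flatten a 2D list of pixel rows into one 1D list, row by row."""
    return [pixel for row in image for pixel in row]

def to_csv_row(image, label):
    """Return one dataset row: flattened pixel values followed by the target."""
    pixels = flatten_image(image)
    if len(pixels) != 28 * 28:
        raise ValueError("expected a 28x28 image")
    if any(not 0 <= p <= 255 for p in pixels):
        raise ValueError("pixel values must lie in [0, 255]")
    return pixels + [label]

# Hypothetical sample: a blank 28x28 image with one bright pixel, labelled 1.
sample = [[0] * 28 for _ in range(28)]
sample[3][5] = 255
row = to_csv_row(sample, 1)

# Write the row as csv text (an in-memory buffer stands in for a file).
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
csv_line = buffer.getvalue().strip()
```

With row-by-row flattening, the pixel at row 3, column 5 ends up at index 3 x 28 + 5 = 89 of the 1D array, and the target is always the last value in the row.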
Load training dataset into ANNHub
Figure 4: Load MNIST training dataset
After the datasets are prepared, the training dataset is loaded into ANNHub in Step 1, as shown in Figure 4. In this step, only a fraction of the dataset is loaded, which gives ANNHub enough information about the dataset format to configure a recommended Neural Network structure.
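The idea of reading only a fraction of the file to learn its format can be sketched as follows. How ANNHub does this internally is not documented here, so the function `peek_format`, its parameters, and the in-memory stand-in for the csv file are all assumptions made for illustration.

```python
import csv
import io
from itertools import islice

def peek_format(lines, max_rows=100, n_outputs=1):
    """Read at most max_rows rows and report the inferred column layout."""
    rows = list(islice(csv.reader(lines), max_rows))
    if not rows:
        raise ValueError("no rows found")
    widths = {len(r) for r in rows}
    if len(widths) != 1:
        raise ValueError("rows have inconsistent column counts")
    n_columns = widths.pop()
    return {
        "rows_read": len(rows),
        "inputs": n_columns - n_outputs,  # every column except the targets
        "outputs": n_outputs,
    }

# Hypothetical stand-in for the training file: 500 rows of 785 columns each.
fake_file = io.StringIO(
    "\n".join(",".join(["0"] * 784 + ["1"]) for _ in range(500))
)
info = peek_format(fake_file, max_rows=100)
```

Only the first 100 rows are parsed, yet that is enough to learn that each sample has 784 inputs and 1 target column.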
Configure Neural Network
Figure 5: Configure the Neural Network
Based on the training dataset, the recommended structure of the Neural Network is configured as shown in Figure 5. However, users can still tweak it to achieve a better result. In this example, the Scaled Conjugate Gradient training algorithm is used, with cross-entropy as the cost function. The configured Neural Network structure has 784 input nodes, 20 hidden nodes, and 1 output node. The activation function for the hidden layer is Tansig, and Softmax is used as the activation function for the output layer. The min-max method is used for both pre-processing and post-processing. The training data ratio is 75%.
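The building blocks named above have standard textbook definitions, shown in the following Python sketch. It is not ANNHub's implementation: the target range [-1, 1] for min-max scaling is an assumption, and the sample scores are illustrative only.

```python
import math

def min_max_scale(values, lo=0.0, hi=255.0, new_lo=-1.0, new_hi=1.0):
    """Linearly map values from [lo, hi] to [new_lo, new_hi] (target range assumed)."""
    return [new_lo + (v - lo) * (new_hi - new_lo) / (hi - lo) for v in values]

def tansig(x):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2x)) - 1, which equals tanh(x)."""
    return math.tanh(x)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (shifted for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probabilities[target_index])

scaled = min_max_scale([0, 127.5, 255])  # pixel values mapped to [-1, 1]
probs = softmax([2.0, 1.0, 0.1])         # illustrative raw scores
loss = cross_entropy(probs, 0)           # cost if class index 0 is correct
```

Scaling the inputs keeps the Tansig units away from their flat, saturated regions, and the cross-entropy cost shrinks toward zero as the probability assigned to the correct class approaches 1.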
Train Neural Network
Figure 6: Train the Neural Network to learn MNIST features.
As shown in Figure 6, the Scaled Conjugate Gradient algorithm is used, and the early stopping technique, which uses the validation set to decide when to stop, is automatically configured and applied during training. The stopping criteria are a max fails value of 1, a training goal of 0.0001, a gradient goal of 0.001, and a maximum of 300 epochs. The training process takes around 21 minutes to complete.
A better result could be achieved by tweaking the Neural Network structure, training algorithm, and its parameters.
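How the four stopping criteria interact can be sketched as a generic training loop. This is a simplified illustration under stated assumptions, not ANNHub's training code: the `step` callback, the order in which the criteria are checked, and the `fake_step` loss curve are all hypothetical.

```python
def train_with_early_stopping(step, max_epochs=300, max_fails=1,
                              training_goal=1e-4, gradient_goal=1e-3):
    """Run step(epoch) -> (train_loss, val_loss, grad_norm) until a criterion is met."""
    best_val = float("inf")
    fails = 0
    for epoch in range(1, max_epochs + 1):
        train_loss, val_loss, grad_norm = step(epoch)
        if train_loss <= training_goal:
            return epoch, "training goal reached"
        if grad_norm <= gradient_goal:
            return epoch, "gradient goal reached"
        if val_loss < best_val:
            best_val = val_loss  # validation improved, reset the counter
            fails = 0
        else:
            fails += 1           # validation did not improve this epoch
            if fails >= max_fails:
                return epoch, "validation stopped improving"
    return max_epochs, "maximum epochs reached"

# Hypothetical loss curve: validation loss improves until epoch 5, then rises.
def fake_step(epoch):
    train_loss = 1.0 / epoch
    val_loss = abs(epoch - 5) * 0.1 + 0.2
    return train_loss, val_loss, 0.5

stopped_at, reason = train_with_early_stopping(fake_step)
```

With a max fails value of 1, training stops at the first epoch in which the validation loss fails to improve, which is epoch 6 for this made-up curve. A larger value tolerates more non-improving epochs before stopping.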
Evaluate the trained Neural Network
Figure 7: Evaluate the trained Neural Network.
After the Neural Network has been trained, the confusion matrix and ROC curve techniques shown in Figure 7 are used to evaluate its performance. The training set, validation set, and test set are all used in the evaluation. As shown in Figure 7, some classes (class 1 corresponds to an output value of 1) have better accuracy than others, but the overall accuracy still reaches around 95%.
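The per-class and overall accuracy figures read from a confusion matrix can be computed as in the sketch below. The labels are hypothetical and are not taken from the MNIST results; the function names are illustrative and do not correspond to an ANNHub API.

```python
def confusion_matrix(targets, predictions, classes):
    """Count samples per (true class, predicted class) pair."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(targets, predictions):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly (row-wise)."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for three classes; not taken from the MNIST results.
targets     = [1, 1, 1, 2, 2, 3, 3, 3]
predictions = [1, 1, 2, 2, 2, 3, 1, 3]
cm = confusion_matrix(targets, predictions, classes=[1, 2, 3])
class_acc = per_class_accuracy(cm)
total_acc = overall_accuracy(cm)
```

Rows are true classes and columns are predicted classes, so off-diagonal entries show which digits are confused with which. A class can score well below the overall accuracy when its errors are concentrated in one row.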
Test trained Neural Network with a new dataset
Figure 8: Test the trained Neural Network with new test dataset.
Before being deployed in a real application, the trained Neural Network can be tested with a new dataset to confirm its generalization. The test dataset contains 10,000 samples that have not been used during the design process described above. As can be seen in Figure 8, the trained Neural Network can still recognize the correct numbers in the test dataset with an accuracy of around 90%. This figure uses a very strict threshold of 0.3: if the predicted result falls in the [1.31;1.69] range, for example, the Neural Network cannot decide whether the image is a 1 or a 2, and the prediction is counted as false. If the threshold is set to 0.49, the overall accuracy is 93.73%.
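The effect of the threshold can be sketched in Python as follows. This assumes the network output is a single real number that is accepted only when it lies within the threshold of the nearest integer, which matches the [1.31;1.69] example above; the function names and sample outputs are hypothetical.

```python
def classify_with_threshold(predicted, threshold):
    """Return the nearest integer label, or None if the output is too far from it."""
    nearest = round(predicted)
    if abs(predicted - nearest) > threshold:
        return None  # ambiguous output, counted as a wrong prediction
    return nearest

def accuracy(outputs, targets, threshold):
    """Share of samples whose thresholded label equals the target."""
    hits = sum(classify_with_threshold(o, threshold) == t
               for o, t in zip(outputs, targets))
    return hits / len(targets)

# Hypothetical network outputs and their true labels.
outputs = [1.05, 1.40, 2.10, 8.65, 3.00]
targets = [1, 1, 2, 9, 3]
strict = accuracy(outputs, targets, threshold=0.3)
loose = accuracy(outputs, targets, threshold=0.49)
```

In this made-up sample, the outputs 1.40 and 8.65 are rejected at a threshold of 0.3 but accepted at 0.49, so a looser threshold gives a higher accuracy, in the same direction as the change from around 90% to 93.73% reported above.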
Deploy the trained Neural Network in Handwriting recognition application
Deployment in different programming environments is straightforward thanks to the APIs provided by ANS Center. In this example, the trained Neural Network is deployed in the LabVIEW environment by using ANNAPI for LabVIEW. The trained Neural Network model is exported to a file with the ".ann" extension. The LabVIEW code that loads the trained Neural Network model and performs prediction is shown in Figure 9.
Figure 9: LabVIEW Block diagram to deploy trained Neural Network in handwriting application
Figure 10: Standalone handwriting application that uses the trained Neural Network to classify handwriting images