How to teach a kid to design a deep learning model that can recognize Marvel superhuman

This article provides a step-by-step guide to teaching an 8-year-old kid to design a deep learning model that recognizes Marvel superhuman characters.

Tuan Nghia Nguyen · 7/2/2019 1:02:34 AM

Do you want to amaze your kids and spark their interest in today's artificial intelligence technology? DLHUB can be a great tool for teaching your kids to design a professional deep learning model that recognizes Marvel superhuman characters with good accuracy. In this article, we use a transfer learning technique to design a deep learning classifier that classifies images of Spiderman, Superman, and Wonder Woman. For more information on the transfer learning technique, please see our Fruit recognition post.

Let's get started.

1. Data preparation 

Preparing a dataset for superhuman recognition is simple. Create a folder in the desired location that contains 3 sub-folders, each named after one of your kids' favorite Marvel characters, as follows:


In each folder, place the images that belong to that character. In this example, we need only a small dataset of 10 training images for each Marvel character, as shown below:


OK, you have placed the correct images into the correct sub-folders, but what about image size and image type? DLHUB supports three popular image types: JPEG (*.jpg), PNG (*.png), and bitmap (*.bmp), and it takes care of image size automatically. The tedious part is therefore handled by DLHUB for you.

Repeat these steps to create two folder structures: one for the training set, which is used to train the deep learning model, and one for the test set, which contains completely different images used to check whether the trained model can recognize the corresponding Marvel character from his or her image. The structure should look like this:
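One possible layout can be sketched in Python. This is only an illustration: the root folder name `marvel_dataset`, the `train` and `test` split names, the class folder names, and both helper functions are assumptions, not names required by DLHUB. The extension list mirrors the three image types mentioned above.

```python
from pathlib import Path
import tempfile

# Image extensions the article says DLHUB supports.
SUPPORTED = {".jpg", ".png", ".bmp"}
# Hypothetical folder names, one per character.
CLASSES = ["Spiderman", "Superman", "WonderWoman"]


def make_dataset_dirs(root: Path) -> None:
    """Create root/train/<class> and root/test/<class> folders."""
    for split in ("train", "test"):
        for name in CLASSES:
            (root / split / name).mkdir(parents=True, exist_ok=True)


def count_images(split_dir: Path) -> dict:
    """Count supported image files in each class sub-folder."""
    return {
        class_dir.name: sum(
            1 for f in class_dir.iterdir() if f.suffix.lower() in SUPPORTED
        )
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }


# Demo in a temporary folder; empty files stand in for real images.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "marvel_dataset"
    make_dataset_dirs(root)
    (root / "train" / "Spiderman" / "img01.jpg").touch()
    (root / "train" / "Spiderman" / "notes.txt").touch()  # ignored: not an image
    print(count_images(root / "train"))
```

A quick count like this makes it easy to confirm that every character folder holds the number of images you expect before loading the dataset.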


2. What is transfer learning?

We said that we were going to use the transfer learning technique, but what exactly does that technique mean? The idea is to reuse a pre-trained deep learning model that has been applied successfully to similar image recognition problems. This pre-trained model has been trained on a very big dataset (up to millions of data samples), so its layers, from the input up to just before the output layer, have learned to extract the key features from an image. We keep these layers of the pre-trained model, replace the last output layer with a new output layer suited to our application, and adjust only the parameters of that new output layer during training. As a result, training is much faster and we do not need a very big dataset to achieve the desired accuracy.
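The idea of keeping the pre-trained layers fixed and training only a new output layer can be illustrated with a toy sketch in plain Python. Everything here is a stand-in: `FROZEN_W` plays the role of the pre-trained layers (a real ResNet18 has millions of weights), and `extract_features`, `train_head`, and `predict` are hypothetical helpers, not part of DLHUB.

```python
import math

# Stand-in for the pre-trained layers: fixed weights that are never updated.
# These four numbers are made up for illustration.
FROZEN_W = [[0.5, -0.2], [0.1, 0.8]]


def extract_features(x):
    """Frozen feature extractor: a fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_W]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def head_scores(features, W, b):
    """Raw outputs of the new output layer (one score per class)."""
    return [sum(w * f for w, f in zip(row, features)) + bias
            for row, bias in zip(W, b)]


def train_head(samples, labels, n_classes, lr=0.5, epochs=200):
    """Train only the new output layer on top of the frozen features."""
    n_feat = len(FROZEN_W)
    W = [[0.0] * n_feat for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = extract_features(x)  # frozen part: used, never changed
            p = softmax(head_scores(f, W, b))
            for k in range(n_classes):
                g = p[k] - (1.0 if k == y else 0.0)  # cross-entropy gradient
                for j in range(n_feat):
                    W[k][j] -= lr * g * f[j]
                b[k] -= lr * g
    return W, b


def predict(x, W, b):
    """Index of the most probable class for input x."""
    p = softmax(head_scores(extract_features(x), W, b))
    return p.index(max(p))


samples = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]  # toy inputs, one per class
labels = [0, 1, 2]
W, b = train_head(samples, labels, n_classes=3)
print([predict(x, W, b) for x in samples])
```

Because only `W` and `b` of the small output layer are updated, each training step is cheap, which is the reason transfer learning can work with few images.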

There are numerous different types of pre-trained models for image recognition, but we will use the Residual Neural Network (ResNet18) from Microsoft. DLHUB will look for this pre-trained model in the following folder:


But what if I do not have this folder?

Not an issue. DLHUB checks whether this folder exists and, if required, downloads all pre-trained models and places them in the correct location automatically.



How about the image size for this pre-trained model? How do I know the correct input shape?

When you choose a pre-trained model, DLHUB detects the correct image size and input shape, and you only need to confirm them.

Sounds complicated? Let's see how we can achieve this task in DLHUB.


3. Loading training dataset into DLHUB

With the training set in the correct folder structure, loading it into DLHUB is a piece of cake. First, browse to the training set folder (1), and then select the pre-trained model (2). That is it!

DLHUB will populate the input shape in (3), and you just need to confirm that it is correct. In this example, we use ResNet18 with input shape 224x224x3, which corresponds to a 224x224 color image with 3 color channels, so all our training images are automatically rescaled to the ResNet18 input size. Since we need to classify 3 Marvel characters, the number of categories is 3. Image augmentation, an image processing technique that creates modified copies of the training images to improve training accuracy, is also supported in DLHUB.
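To give a feel for what image augmentation does, here is a minimal sketch of one common augmentation, a horizontal flip, on a tiny image stored as a list of pixel rows. Which augmentations DLHUB actually applies is not stated in this article, so the flip and both function names are assumptions chosen for illustration.

```python
def flip_horizontal(image):
    """Mirror an image left to right; `image` is a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def augment(images):
    """Return the original images plus one mirrored copy of each."""
    return images + [flip_horizontal(img) for img in images]


tiny = [[1, 2, 3],
        [4, 5, 6]]
print(flip_horizontal(tiny))   # [[3, 2, 1], [6, 5, 4]]
print(len(augment([tiny])))    # 2
```

A mirrored Spiderman is still Spiderman, so each flipped copy gives the model one more valid training example without anyone having to collect a new picture.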




Sounds simple? How about building a deep learning model? Well, let's see how DLHUB handles it.

4. Create deep learning model in DLHUB

Should we expect something complicated when we design a deep learning model that uses the pre-trained ResNet model? Not at all.

In DLHUB, you simply choose the TLModel (1) from the function palette, select the correct pre-trained model from the drop-down list (2), and add it to the layer chain (3).



The output layer of the pre-trained model is removed by DLHUB automatically, so we do not need to worry about how to remove it. What we need to do is add a new output layer that is suitable for our application. In this case, we choose a fully connected layer, also called a Dense layer (1), and configure the number of nodes and the activation function. As stated above, we need 3 nodes for the 3 categories that correspond to the 3 Marvel characters, and the Softmax function, which turns the 3 node outputs into probabilities that sum to 1, is a good choice for pattern recognition problems like this one (2).
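As a small illustration of what the Softmax function does with the 3 output nodes, the sketch below converts three raw scores into probabilities and picks the most likely character. The scores are made up, and the class names and their order are assumptions for the example.

```python
import math

CLASSES = ["Spiderman", "Superman", "Wonder Woman"]  # assumed order


def softmax(scores):
    """Convert raw node outputs into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the peak avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]  # made-up raw outputs of the 3 nodes
probs = softmax(scores)
best = probs.index(max(probs))
print([round(p, 2) for p in probs])  # [0.66, 0.24, 0.1]
print(CLASSES[best])                 # Spiderman
```

The node with the largest raw score always gets the largest probability, so the predicted class is simply the one with the highest Softmax output.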



But I am not confident that my model is correct. 

DLHUB provides a verification procedure to check whether your deep learning model is valid. Just press the "Verify model" button (4); if the small bar turns green, you are good to go. Otherwise, DLHUB suggests how to correct your deep learning model.

5. Train deep learning model in DLHUB

Let's see how DLHUB handles the training process. First, select a training algorithm, its parameters, the loss function, the accuracy type, and the metric type in (1). Then select the parameters that specify when to stop the training process in (2) before you start training.




Q: Wait a minute, I do know that the deep learning model can recognize my kids' favorite character via the learning process, but the above information does not mean anything to me at all.

A: OK, let me explain it to you. To make it simple, imagine that your kids will sit the Naplan examination in 3 months, and you plan to train them over those 3 months so that they can pass that examination.

First, you need to find Naplan training books so that your kids can learn all the Naplan subjects for the test. You do not know what is in the actual final Naplan test, but you can guess that it will contain some key information, so you look for training books that cover this information. The purpose of the training is for your kids to learn this key information so that they can perform well in the final test.

How do you do that? First, you guide your kids to build their knowledge of the Naplan subjects by studying the training books, and each training book should have a series of mini-tests to check your kids' knowledge. This process is called the training process or optimization process, and the mini-test result that measures the gap in your kids' understanding of the Naplan subjects' concepts is called the loss function. If your kids fully understand the key Naplan concepts, the loss function is at its minimum (zero). The approach you use to teach your kids is the training algorithm, and the settings you adjust to make this approach suitable for your kids are called training parameters.

This process is repeated over the 3 months until your kids fully understand all the Naplan training materials. The method that measures how well your kids understand the key Naplan concepts is called validation, and the desired result of this learning process is called the training goal.

Assume that you plan to teach your kids with 10 training books, and each book contains 10 chapters that cover 10 different topics. To help your kids absorb the knowledge, you let them learn a few chapters at a time (2, for example); this number is called the batch size. One full pass through all the chapters is called a training epoch. If your kids understand all the Naplan concepts after 1 epoch, that is really good. If not, you ask your kids to go through the training books again for another epoch.

During the three-month training period, your kids' understanding of the Naplan subjects gets better and better, which means the difference between their knowledge and the actual Naplan knowledge gets smaller and smaller. This is represented by the training curve (3).

So, your kids stop training when the difference in knowledge meets the training goal or when the maximum number of epochs is reached.
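For readers who like to see these terms in code, here is a minimal sketch of a training loop that uses a batch size, epochs, a training goal, and a maximum number of epochs. It fits a single weight on toy data by gradient descent; the function name, the toy data, and all default values are assumptions for illustration and are not DLHUB's defaults.

```python
def train(data, batch_size=2, learning_rate=0.05, goal=1e-4, max_epochs=100):
    """Fit y = w * x by mini-batch gradient descent.

    Stops when the mean squared error reaches `goal` (the training goal)
    or after `max_epochs` full passes over the data.
    Returns the weight and the loss recorded after each epoch.
    """
    w = 0.0
    curve = []  # the training curve: one loss value per epoch
    for _ in range(max_epochs):
        # One epoch: go through all the data, one batch at a time.
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad
        loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
        curve.append(loss)
        if loss <= goal:  # training goal met: stop early
            break
    return w, curve


data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # true weight is 2
w, curve = train(data)
print(len(curve), round(w, 2))
```

The list `curve` is the training curve from the analogy: it shrinks epoch after epoch until one of the two stopping criteria ends the loop.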


Q: Sounds interesting. Does that mean if my kids fully understand the training materials, they will get good results in the final Naplan test?

A: Well, not quite. The training materials might not cover all the topics in the final Naplan examination, and if you choose training books that focus on only a few narrow topics, your kids will fail a final exam that has broader topics. This is called over-fitting or over-training. To avoid this issue, you need to find and select really good training books that cover as many Naplan topics as possible, plus some other books that are not used for training but are used to validate your kids' knowledge during the training period. This process is called data preparation, and the other books, which are not used to teach your kids but to validate their knowledge, are called the validation set. If you select good training books and apply validation tests during the training period, your kids are likely to understand broader Naplan topics and get good results in the final Naplan examination.
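Setting aside "other books" for validation can be sketched as a simple split of the available images. This is a generic illustration, not DLHUB's own procedure: the function name, the 20% validation fraction, and the file names are all assumptions.

```python
import random


def split_dataset(items, validation_fraction=0.2, seed=42):
    """Shuffle the items and hold out a fraction as the validation set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed: repeatable split
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


images = [f"img{i:02d}.jpg" for i in range(10)]  # hypothetical file names
train_set, val_set = split_dataset(images)
print(len(train_set), len(val_set))  # 8 2
```

The important property is that no image appears in both sets, so the validation score reflects images the model has never been trained on.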


Q: OK, that is easy to understand, but how do I choose the correct training algorithm and its parameters? How about the stopping criteria parameters?

A: I agree that it is hard to select the best training algorithm, but DLHUB provides default settings for the training algorithm, its parameters, and the stopping criteria. Please start with those settings. If you see the training curve go up, stop the training and reduce the learning rate. Some trial and error might give you the best result. Good luck!
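Why a training curve can go up when the learning rate is too large is easy to see on a toy problem. The sketch below runs plain gradient descent on the loss `w**2` with two learning rates; the function name and both rates are chosen purely for illustration.

```python
def loss_curve(learning_rate, steps=10, w=1.0):
    """Gradient descent on loss(w) = w**2; returns the loss after each step."""
    curve = []
    for _ in range(steps):
        w -= learning_rate * 2 * w  # the gradient of w**2 is 2*w
        curve.append(w * w)
    return curve


good = loss_curve(0.1)  # each step multiplies w by 0.8: the loss shrinks
bad = loss_curve(1.1)   # each step multiplies w by -1.2: the loss grows
print(good[0] > good[-1], bad[0] < bad[-1])  # True True
```

With the large rate every step overshoots the minimum by more than it corrects, so the curve climbs; reducing the learning rate restores the downward trend.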


6. Evaluate deep learning model in DLHUB

It is time to test our trained deep learning model on the test set, which contains images that were not used in the training process, so our model has not seen these images before. Hopefully, it can pass the test.

To do that, browse to the correct test folder (1) and run the evaluation by clicking the evaluation button (2). The result shows that the model achieves around 87% accuracy on 30 unseen test images.
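Accuracy here is simply the share of test images that were classified correctly. The sketch below shows the arithmetic under one assumption: that 26 of the 30 test images were predicted correctly, which is a count consistent with "around 87%" but is not stated in the evaluation result. The labels and the positions of the mistakes are made up.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# 10 test images per character (assumed, to total 30).
actual = ["Spiderman"] * 10 + ["Superman"] * 10 + ["Wonder Woman"] * 10
predicted = list(actual)
# Made-up mistakes: 4 images get the wrong label.
for i in (3, 12, 17, 25):
    predicted[i] = "Superman" if actual[i] != "Superman" else "Spiderman"

print(f"{accuracy(predicted, actual):.1%}")  # 86.7%
```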

Q: Only 87%? Can we improve accuracy?

A: Yes, you can usually improve the accuracy by adding more images to the training set, so that it covers more of the features that appear in the test set.





7. Amaze your kids with a deep learning recognition model

You can now amaze your kids with what the trained deep learning model has learned. To see how the trained model recognizes a Marvel character, browse to the test folder (1) to load all test images into the Test Files list (2). Then select any image from that list and see whether the model gives the correct prediction in the Predicted Class (3).

Congratulations, you have successfully taught your kid to design a proper deep learning model that can recognize his or her favorite superhuman.

Q: Wait, can we choose to recognize different objects?

A: Sure, you can apply the same technique with a training set that contains other objects, such as cats or dogs, and the trained model will be able to recognize these objects.




8. How do I try it myself? Where do I download DLHUB? Is it free to use?

Yes, you can download the latest DLHUB version from the Download page on our website:

You can start using DLHUB with your account registered on the ANSCENTER website.

DLHUB is completely free to use. You only need to pay for the appropriate product license when you deploy the trained model to another programming environment.






ANS Center provides elegant solutions to simplify machine learning and deep learning design and deployment.

We support multiple hardware platforms and programming languages, including LabVIEW, LabVIEW NXG, LabWindows/CVI, C/C++/C#, and Arduino.