Hackerearth Deep Learning Challenge #1

Hackerearth hosted a Deep Learning challenge from September 1st to October 1st, 2017. The goal of the challenge was to create a deep learning model that correctly classifies grocery items from photos.

The mission statement of the challenge was to "help one of the largest retailers in Germany improve their inventory-management process in its Food and Groceries business. The company is looking for intelligent solutions that can reduce the amount of human effort in its warehouse and retail outlets. A solution such as a powerful image classifier can help the company track shelf inventory, categorize products, record product volume etc."

My submission for this challenge used a pre-trained Inception V3 model and applied transfer learning to this new dataset, which has a different number (and type) of output classes.
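To illustrate what transfer learning with Inception V3 can look like, here is a minimal sketch. It is not the code used for the submission. It assumes TensorFlow's Keras API, a frozen convolutional base, and a new 25-way softmax head. The function name `build_transfer_model`, the dropout rate, and the pooling layer are illustrative choices.

```python
import tensorflow as tf


def build_transfer_model(num_classes=25, input_shape=(256, 256, 3),
                         weights="imagenet"):
    """Inception V3 base with a new classification head (illustrative sketch)."""
    # Load the convolutional base without its original 1000-class top layer.
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # freeze the pre-trained features (assumption)

    inputs = tf.keras.Input(shape=input_shape)
    # Inception V3 expects pixel values scaled to [-1, 1].
    x = tf.keras.layers.Rescaling(scale=1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.5)(x)  # assumed rate, not from the source
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


# Example usage (downloads ImageNet weights on first call):
# model = build_transfer_model()
# model.compile(optimizer="adam",
#               loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
```

Freezing the base keeps the number of trainable parameters small, which tends to help on a dataset of only a few thousand images. Unfreezing the top Inception blocks for fine-tuning is a common follow-up step.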

The trained model achieved an accuracy of 90.0% on the validation dataset, and 89.23% on the test set used for the official final rankings. This placed my submission in 26th position out of the 305 participants who were able to make a submission before the deadline (approximately 4,500 participants had entered the competition).

The ranking can be verified on the public leaderboard here. It appears under the team name "Gradients of the Galaxy".

The data

The data consisted of 256x256 resolution images in PNG format. There were 3215 images for training, and 1732 for testing. For each image filename, a CSV file provided a label indicating the class of grocery item. There were 25 classes to identify.
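As a rough sketch of how a label file like this can be prepared for training, the snippet below reads a CSV, maps class names to integer indices, and holds out a stratified validation set. The column names (`image_id`, `label`), the sample rows, and the 20% validation fraction are assumptions for illustration. They are not taken from the challenge files.

```python
import csv
import io
import random
from collections import defaultdict

# Hypothetical stand-in for the challenge's label file; the real column
# names and class names may differ.
SAMPLE_CSV = """image_id,label
train_1,rice
train_2,candy
train_3,rice
train_4,jam
train_5,candy
train_6,jam
train_7,rice
train_8,candy
"""


def load_labels(csv_file):
    """Return (rows, class_to_index) from an open CSV file object."""
    rows = [(r["image_id"], r["label"]) for r in csv.DictReader(csv_file)]
    classes = sorted({label for _, label in rows})
    class_to_index = {name: i for i, name in enumerate(classes)}
    return rows, class_to_index


def stratified_split(rows, val_fraction=0.2, seed=0):
    """Split rows so every class contributes at least one validation image."""
    by_class = defaultdict(list)
    for image_id, label in rows:
        by_class[label].append(image_id)
    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(by_class):
        ids = by_class[label]
        rng.shuffle(ids)
        # At least one validation sample per class, even for tiny classes.
        n_val = max(1, round(len(ids) * val_fraction))
        val += [(i, label) for i in ids[:n_val]]
        train += [(i, label) for i in ids[n_val:]]
    return train, val


rows, class_to_index = load_labels(io.StringIO(SAMPLE_CSV))
train_rows, val_rows = stratified_split(rows)
```

Splitting per class keeps the validation set representative when some classes have far fewer images than others, which is likely with roughly 3,200 images spread over 25 classes.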

Below is a visualization of 9 samples from each of the classes in the training data.

Image of training samples

A Jupyter Notebook with some of the exploratory data analysis performed can be viewed here.

Training

The model was trained for over 100 epochs, at which point no further improvements were being made. The training curves can be seen below.

Image of training curves

As can be seen from the training curves plot above, the model is suffering from high variance (overfitting). Unfortunately, I was not able to train another model with more aggressive regularization before the deadline.
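The overfitting diagnosis can also be made numerically from the per-epoch accuracies. The sketch below reports the best validation epoch, the gap between training and validation accuracy at that epoch, and the epoch at which a simple early-stopping rule would have halted training. The accuracy values are made up for illustration and are not the curves from this submission. The function name and the patience value are likewise assumptions.

```python
def summarize_curves(train_acc, val_acc, patience=3):
    """Summarize training curves (epochs are zero-indexed)."""
    # Epoch with the highest validation accuracy (first one on ties).
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i])
    # A large positive gap suggests high variance (overfitting).
    gap = train_acc[best_epoch] - val_acc[best_epoch]

    # Early stopping: halt after `patience` epochs without a strict
    # improvement in validation accuracy.
    best_so_far, since_best, stop_epoch = float("-inf"), 0, None
    for epoch, acc in enumerate(val_acc):
        if acc > best_so_far:
            best_so_far, since_best = acc, 0
        else:
            since_best += 1
            if since_best >= patience:
                stop_epoch = epoch
                break
    return {"best_epoch": best_epoch, "gap": gap, "stop_epoch": stop_epoch}


# Illustrative values only, not the actual training history.
train_acc = [0.60, 0.75, 0.85, 0.92, 0.96, 0.98, 0.99, 0.99]
val_acc = [0.58, 0.70, 0.80, 0.86, 0.90, 0.89, 0.90, 0.89]
summary = summarize_curves(train_acc, val_acc, patience=3)
```

When training accuracy keeps climbing while validation accuracy plateaus, stronger regularization (for example heavier dropout, weight decay, or data augmentation) is the usual next thing to try.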

The source code used for this submission can be viewed on GitHub.

Credits