An app that groups the images in a folder using a ResNet-50 image classifier trained with transfer learning.
Managing an overcrowded gallery folder can be exhausting, especially when sorting through numerous unwanted photos. Consider this scenario:
"Mr. Joni, a businessman, frequently communicates with customers via WhatsApp, regularly exchanging both digital and printed bank payment receipts. Over time, these receipts have accumulated significantly, consuming his storage space. Payment receipts from three years ago are no longer relevant, yet Mr. Joni must now navigate through thousands of images in his 'WhatsApp Image' folder to clear space."
Gallery Cleaner addresses challenges like Mr. Joni's by intelligently categorizing images within a folder into five distinct categories:
receipt
screenshot
people
food
landscape
The training ran on my campus's (Petra Christian University) accelerator server, which we call "gpu3". The server is powered by an AMD Instinct MI120 GPU. Although the GPU is powerful, AMD GPUs require additional configuration to work with TensorFlow. This is where ROCm comes into play. I had to install ROCm and document the process, since TensorFlow had never been used on this server before. I ran into many problems related to version dependencies. In the end, I contributed to the campus server by publishing the documentation, so that other students can use TensorFlow with the server's GPU.
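After a ROCm installation like the one described above, a quick sanity check is to ask TensorFlow which GPUs it can see. The following is a minimal sketch, not the documented campus procedure; the helper name `visible_gpus` is hypothetical, and it assumes a TensorFlow build with ROCm support (for example the `tensorflow-rocm` package) is installed in the active environment.

```python
def visible_gpus() -> list:
    """Return the names of the GPUs TensorFlow can see, or [] if there are none."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus if gpus else "No GPU visible to TensorFlow")
```

An empty list usually points to a version mismatch between TensorFlow and the GPU stack, or to a CPU-only TensorFlow build.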
This approach uses the ResNet-50 architecture alone, with randomly initialized weights rather than pre-trained ones. As expected, the model needed significantly more training time to converge and struggled to reach high accuracy, ultimately achieving only about 75% validation accuracy. I realized I needed another approach, and this is when I came across transfer learning.
I discovered that I could leverage a ResNet-50 model pre-trained on ImageNet thanks to this excellent Kaggle resource: Cats or Dogs - using CNN with Transfer Learning.
In this implementation, Gabriel Preda successfully trained a model with:
I adapted this approach to my multi-class problem by:
Here are the training results:
The training results show the model avoided overfitting, but couldn't exceed 85% accuracy after 4 hours of training. I later discovered this frozen approach works best with:
This time, I didn't freeze the pre-trained layers. Here is the result:
The test result shows that the Transfer-learned Unfrozen ResNet-50 achieves ~99% accuracy in less than 1 hour of training. I decided to use this model type.
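The three ResNet-50 variants compared so far (randomly initialized, transfer-learned frozen, and transfer-learned unfrozen) differ only in where the weights come from and whether the base network is trainable. The sketch below shows one way to express that in Keras. It is not the project's actual training code: the function name `build_resnet50_classifier`, the 224x224 input size, the single softmax head, and the Adam optimizer are all assumptions made for illustration.

```python
NUM_CLASSES = 5  # receipt, screenshot, people, food, landscape
INPUT_SHAPE = (224, 224, 3)  # assumed; the default ResNet-50 input size


def build_resnet50_classifier(pretrained: bool, freeze_base: bool):
    """Build a ResNet-50 classifier for the five gallery categories.

    pretrained=False, freeze_base=False -> randomly initialized weights
    pretrained=True,  freeze_base=True  -> frozen transfer learning
    pretrained=True,  freeze_base=False -> unfrozen transfer learning
    """
    if freeze_base and not pretrained:
        raise ValueError("freezing a randomly initialized base would leave it untrained")

    # Imported lazily so the constants above can be reused without TensorFlow.
    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        include_top=False,  # drop the 1000-class ImageNet head
        weights="imagenet" if pretrained else None,
        input_shape=INPUT_SHAPE,
        pooling="avg",  # global average pooling gives one feature vector per image
    )
    base.trainable = not freeze_base

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Under these assumptions, the model chosen above corresponds to `build_resnet50_classifier(pretrained=True, freeze_base=False)`.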
I looked further into transfer learning. The current method did not feel sufficient, and I suspected that the previous model was overfitting. I found that Xception is widely used for transfer learning and fine-tuning. Here is the result:
I suspected that this model type was overfitting since the validation accuracy surpassed the training accuracy, albeit not significantly (only ~1%). In any case, the model only reached an accuracy of ~95%, although it did so in less than 10 minutes of training. Hence, I decided to use the Transfer-learned Unfrozen ResNet-50 model.
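The comparison between training and validation accuracy used above can be read directly from the history dictionary that Keras returns from `model.fit`. Here is a minimal sketch; the helper name `accuracy_gap` is hypothetical, and the numbers are made up for illustration and are not this project's measurements.

```python
def accuracy_gap(history: dict) -> float:
    """Return the final validation accuracy minus the final training accuracy."""
    return history["val_accuracy"][-1] - history["accuracy"][-1]


# Illustrative values only, shaped like `model.fit(...).history`.
example_history = {
    "accuracy": [0.80, 0.90, 0.94],
    "val_accuracy": [0.85, 0.93, 0.95],
}

gap = accuracy_gap(example_history)
print(f"validation - training = {gap:+.2%}")
```

A positive gap means the validation accuracy ended above the training accuracy, which is the situation described for the Xception run.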
Next, I built the model into an application. I used Tkinter for the user interface and PyInstaller to build an .exe, since the system is most likely to be used locally. This raised a problem: the application needs the TensorFlow library to run. Asking users to install TensorFlow and activate the corresponding virtual environment is not practical, and bundling the whole TensorFlow library into the .exe is inefficient. Hence, I decided to use TensorFlow Lite. Here is what the application looks like:
people and receipt. The cleaning process starts. The application moves all the images into folders corresponding to their classes. Afterwards, the user can review the contents of each folder and delete them.
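The cleaning step can be sketched as a TensorFlow Lite classifier plus a loop that moves files. This is an illustration under stated assumptions, not the application's actual code. The names `make_tflite_classifier`, `sort_images`, `CLASS_NAMES`, and `IMAGE_SUFFIXES` are hypothetical, the class order is assumed to match the model's output order, and the exported model is assumed to take a 224x224 float32 RGB image with any further preprocessing built in.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict

CLASS_NAMES = ["receipt", "screenshot", "people", "food", "landscape"]  # assumed order
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of supported formats


def make_tflite_classifier(model_path: str, size: int = 224) -> Callable[[Path], str]:
    """Return a function that maps an image path to one of CLASS_NAMES."""
    # Imported lazily so sort_images() below works without these packages.
    import numpy as np
    from PIL import Image
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    def classify(path: Path) -> str:
        image = Image.open(path).convert("RGB").resize((size, size))
        # Add a batch dimension: (size, size, 3) -> (1, size, size, 3).
        batch = np.expand_dims(np.asarray(image, dtype=np.float32), axis=0)
        interpreter.set_tensor(input_index, batch)
        interpreter.invoke()
        scores = interpreter.get_tensor(output_index)[0]
        return CLASS_NAMES[int(np.argmax(scores))]

    return classify


def sort_images(folder: Path, classify: Callable[[Path], str]) -> Dict[str, int]:
    """Move each image in `folder` into a subfolder named after its class."""
    counts: Dict[str, int] = {}
    # sorted() snapshots the listing before any file is moved.
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = classify(path)
        target_dir = folder / label
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Passing the classifier in as a plain callable keeps the file-moving logic independent of TensorFlow, so it can be exercised with a stub function before a real model is plugged in.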
Here is the result of the cleaning process:
From this project, I learned some valuable lessons: