Transfer Learning

Author: Mohammad Amaan Abbasi | June 18, 2019

Transfer learning is a machine learning method in which a model pretrained on one task is used as the starting point for a different task.

It is an efficient way to solve a problem in less time. It reuses the features learned on a previous task to solve a different one. Transfer learning is commonly used in computer vision and natural language processing tasks.

Training a model from scratch is time consuming, and it may also need a large dataset to reach satisfactory accuracy. Transfer learning also helps the model generalize, which in turn tends to increase the validation accuracy.

When fine-tuning the pretrained model, we generally use a learning rate smaller than the one used to train it initially, so that the features it has already learned are not overwritten too quickly.
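
To see why a smaller learning rate matters, here is a minimal pure-Python sketch of a single gradient-descent step. The weights, the gradient, and both learning rates are made-up illustrative values, not taken from the notebook. The only point is that the size of the update scales with the learning rate, so a smaller rate keeps the weights closer to their pretrained values.

```python
def sgd_step(weights, grads, lr):
    """Return new weights after one plain gradient-descent update."""
    return [w - lr * g for w, g in zip(weights, grads)]


def total_change(old, new):
    """Sum of absolute differences between two weight lists."""
    return sum(abs(a - b) for a, b in zip(old, new))


# Hypothetical pretrained weights and a gradient computed on the new task
pretrained = [0.50, -1.20, 0.80]
grads = [2.0, -1.0, 4.0]

# Hypothetical learning rates: one for initial training, one for fine-tuning
initial_lr = 0.01
fine_tune_lr = 0.0001

moved_initial = total_change(pretrained, sgd_step(pretrained, grads, initial_lr))
moved_fine_tune = total_change(pretrained, sgd_step(pretrained, grads, fine_tune_lr))

print(moved_initial)    # larger shift away from the pretrained weights
print(moved_fine_tune)  # much smaller shift
```

Real optimizers such as Adam rescale the step, but the learning rate still controls how far one update can move the weights.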

How To Use A Pretrained Model?

  1. Choose a pretrained model. There is a wide range of pretrained models to choose from. You should choose a neural network that is suitable for your problem. For example, if you are creating a NN that classifies animals, then you can choose a pretrained model that was trained on an animal dataset.

You will not always find a pretrained model that exactly matches your objectives. However, there are many general-purpose models that you can make use of.

A pretrained model might not be 100% accurate, but it saves the huge effort of reinventing the wheel.

  2. Remove the top layer

  3. Train the pretrained NN. Make some of the layers trainable and freeze the others. Always make the top layers of the model trainable. How many layers to train depends on how similar your dataset is to the one the pre-trained model was trained on, and on how much data you have. There is no formula for deciding how many layers to train.

There is, however, a general rule of thumb that you can use when deciding.

Large dataset, but different from the pre-trained model’s dataset: You have a large dataset, but it differs from the one used to train the pre-trained model. In this case you can reuse the architecture of the model and train all of its layers.

Large dataset and similar to the pre-trained model’s dataset: With a large dataset that is also similar, you can save a huge training effort by leveraging previous knowledge. It should be enough to train the classifier and the top layers of the convolutional base.

Small dataset and different from the pre-trained model’s dataset: It is hard to decide how many layers to train in this case. Training too many layers leads to overfitting, while training too few leads to underfitting. In this situation you might want to use data augmentation to increase the number of samples, then train the model while keeping the first few layers unchanged.

Small dataset, but similar to the pre-trained model’s dataset: In this case you should only remove the last fully connected layer and run the pretrained model as a fixed feature extractor.

  4. Add more layers. Adding more layers to the pretrained network can help the network learn the features better. You might want to experiment with the number of layers until you find what works for you.
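
The rule of thumb from the training step can be written down as a small lookup. This is only a sketch of the four cases described above: `choose_strategy` is a hypothetical helper, and what counts as "large" or "similar" remains a judgment call that the code does not make for you.

```python
def choose_strategy(large_dataset, similar_to_pretraining):
    """Map the two rule-of-thumb questions to a training strategy."""
    if large_dataset and not similar_to_pretraining:
        return "reuse the architecture and train all layers"
    if large_dataset and similar_to_pretraining:
        return "train the classifier and the top layers of the convolutional base"
    if not large_dataset and not similar_to_pretraining:
        return "augment the data, then train with the first few layers unchanged"
    # Small dataset that is similar to the pretraining data
    return "remove the last fully connected layer and use the model as a fixed feature extractor"


# Print the recommendation for every combination of the two answers
for large in (True, False):
    for similar in (True, False):
        print(f"large={large}, similar={similar}: {choose_strategy(large, similar)}")
```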

Coding it

Get the code from this GitHub repo.

The training.ipynb notebook contains the code for training the model.

But before going through the code, you should understand how the model accesses the data.

Directory structure

- data
    |- training-data
        |- 0 
        |- 1
    |- testing-data
        |- 0
        |- 1

Here is how I have arranged the data: 0 contains the normal lung X-ray images, while 1 contains the abnormal ones. You can name these folders something other than 0 and 1; Keras uses each directory name as the class label.
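
The sketch below mimics, in plain Python, how class labels can be derived from folder names. `infer_class_indices` is a hypothetical helper written for illustration, not a Keras function. As far as I know, Keras sorts the subdirectory names and exposes a similar mapping as the generator's `class_indices` attribute.

```python
import tempfile
from pathlib import Path


def infer_class_indices(directory):
    """Map each subdirectory name to an integer index, in sorted order."""
    class_names = sorted(p.name for p in Path(directory).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(class_names)}


# Build a throwaway copy of the layout shown above (no real images needed)
with tempfile.TemporaryDirectory() as root:
    for split in ("training-data", "testing-data"):
        for label in ("0", "1"):
            (Path(root) / "data" / split / label).mkdir(parents=True)

    class_indices = infer_class_indices(Path(root) / "data" / "training-data")

print(class_indices)  # {'0': 0, '1': 1}
```

If you rename the folders, only the keys of this mapping change; the integer indices still follow the sorted order of the names.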

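from keras.applications.vgg16 import VGG16
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator

# Load VGG16 without its top (classifier) layers.
# Assumption: this line is missing from the snippet; ImageNet weights are assumed.
base_model = VGG16(weights='imagenet', include_top=False)

x = base_model.output
# Assumption: pool the 4D feature map into a vector before the Dense layers.
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
preds = Dense(2, activation='softmax')(x)  # two classes: 0 and 1

# Assumption: wrap the base and the new head into one model.
model = Model(inputs=base_model.input, outputs=preds)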

Importing the dependencies and loading the base model. include_top=False leaves out the fully connected classifier layers at the top of the model.

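for layer in model.layers[:20]:
    layer.trainable = False  # freeze the first 20 layers
for layer in model.layers[20:]:
    layer.trainable = True   # train the remaining layers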

This is how you can set how many layers you want to train. I have made the last 5 layers trainable for now.
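
If you would rather say "train the last N layers" than pick a slice index by hand, a small helper can compute the split for you. This is a sketch under assumptions: `FakeLayer` is a hypothetical stand-in for a Keras layer (only the `trainable` flag matters here), `freeze_all_but_last` is not part of Keras, and the layer count of 24 is arbitrary.

```python
class FakeLayer:
    """Hypothetical stand-in for a Keras layer: a name and a trainable flag."""

    def __init__(self, name):
        self.name = name
        self.trainable = True


def freeze_all_but_last(layers, n_trainable):
    """Freeze every layer except the last n_trainable ones."""
    split = len(layers) - n_trainable
    for layer in layers[:split]:
        layer.trainable = False
    for layer in layers[split:]:
        layer.trainable = True


# 24 layers is an arbitrary count chosen for illustration
layers = [FakeLayer(f"layer_{i}") for i in range(24)]
freeze_all_but_last(layers, n_trainable=5)

trainable_names = [layer.name for layer in layers if layer.trainable]
print(trainable_names)
```

Computing the split index explicitly, instead of slicing with `layers[-n_trainable:]`, also behaves correctly when `n_trainable` is 0. With a real model you would pass `model.layers` instead of the fake list.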


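# Assumption: the generator setup and any arguments after target_size were lost; Keras defaults are used here.
train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory('data/training-data', target_size=(64, 64))
valid_generator = train_datagen.flow_from_directory('data/testing-data', target_size=(64, 64))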

Loading the data from the respective directories.

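import keras

# Specify parameters
lr = 0.0001
epochs = 50

# Adam optimizer with the small learning rate chosen above
# (newer Keras versions name this argument learning_rate)
Adam = keras.optimizers.Adam(lr=lr)

# Number of full validation batches per epoch
step_size_valid = valid_generator.n//valid_generator.batch_size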


Training the model and saving it to disk.
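
The step size above is computed with integer division, which counts only full batches. The sketch below shows the arithmetic with made-up numbers. `steps_per_epoch` is a hypothetical helper, and the sample count and batch size are illustrative, not the notebook's real values.

```python
import math


def steps_per_epoch(n_samples, batch_size, drop_partial=True):
    """Number of batches per epoch, with or without the final partial batch."""
    if drop_partial:
        return n_samples // batch_size
    return math.ceil(n_samples / batch_size)


# Hypothetical sizes for illustration only
n_valid = 624
batch_size = 32

full_batches = steps_per_epoch(n_valid, batch_size)
all_batches = steps_per_epoch(n_valid, batch_size, drop_partial=False)
leftover = n_valid - full_batches * batch_size

print(full_batches, all_batches, leftover)  # 19 20 16
```

With these numbers, integer division skips the last 16 images in each pass. Rounding up instead would include them at the cost of one smaller batch.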