Author: Pratik Sharma

Project 8 - Computer Vision - Face Detection

In this hands-on project, the goal is to build a face detection model, that is, a detector that locates the position of faces in an image.

Wider Face Dataset

WIDER FACE is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. It contains 32,203 images with 393,703 labeled faces that show a high degree of variability in scale, pose and occlusion, as depicted in the sample images.

In this project, we are using 409 images and around 1000 faces for ease of computation.

We will use transfer learning on a pre-trained model to build our detector. The base is MobileNet, loaded here with ImageNet weights. The plan is to train the last 6-7 layers and freeze the remaining layers so that the model adapts to face detection. For training, we use the WIDER FACE dataset, which already provides bounding box data for images with a single face and with multiple faces. The output of the model is a mask that gives the location of the faces in an image. We build the face detection model using Keras on top of TensorFlow.
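
The freezing step described above can be sketched as follows. This is a minimal illustration rather than the notebook's own code: `freeze_all_but_last` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for Keras layers, which expose the same boolean `trainable` attribute. The count of 7 trainable layers is taken from the "6-7 layers" mentioned above.

```python
from types import SimpleNamespace


def freeze_all_but_last(layers, n_trainable):
    """Mark the last `n_trainable` layers trainable and freeze the rest."""
    cutoff = len(layers) - n_trainable
    for position, layer in enumerate(layers):
        layer.trainable = position >= cutoff
    return layers


# Stand-in objects; with Keras one would pass `model.layers` instead (assumption).
layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(20)]
freeze_all_but_last(layers, n_trainable=7)

trainable_names = [layer.name for layer in layers if layer.trainable]
print(trainable_names)
```

In Keras, a change to `trainable` only takes effect once the model is compiled again, so a helper like this would be called before `compile`.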

Acknowledgement for the datasets

WIDER FACE dataset: http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/

MobileNet paper: https://arxiv.org/pdf/1704.04861.pdf

Objective of the project

In this problem, we apply transfer learning to a pre-trained model so that it detects the objects relevant to the problem at hand. Here, we are specifically interested in detecting faces in a given image.

Face detection

The task is to predict the boundaries (mask) around the faces in a given image.
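
To make the idea of a mask concrete, the sketch below rasterises bounding boxes into a binary image: pixels inside any face box are 1, everything else is 0. It assumes box corners are given as normalised `(x1, y1, x2, y2)` values in [0, 1]; the helper name `boxes_to_mask` and the 8x8 example are illustrative only.

```python
import numpy as np


def boxes_to_mask(boxes, height, width):
    """Rasterise normalised (x1, y1, x2, y2) boxes into a binary mask."""
    mask = np.zeros((height, width), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        # Scale the [0, 1] coordinates to pixel indices.
        left, right = int(x1 * width), int(x2 * width)
        top, bottom = int(y1 * height), int(y2 * height)
        mask[top:bottom, left:right] = 1.0
    return mask


# Hypothetical example: one face covering the central part of an 8x8 image.
mask = boxes_to_mask([(0.25, 0.25, 0.75, 0.75)], height=8, width=8)
print(mask)
```

Overlapping boxes simply merge into one region, which is why a mask alone does not separate adjacent faces.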

Dataset

The images have faces marked with bounding boxes: around 500 images with around 1,100 faces, manually tagged via bounding boxes.

In [1]:
# Mounting Google Drive
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
In [0]:
# Setting the current working directory
import os; os.chdir('drive/My Drive/Great Learning/Computer Vision Project 1')

Import Packages

In [3]:
# Imports
import pandas as pd, numpy as np, matplotlib.pyplot as plt
from matplotlib import pyplot
%matplotlib inline

# Create features and labels
from tensorflow.keras.applications.mobilenet import preprocess_input
import cv2

# Model
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.layers import Concatenate, UpSampling2D, Conv2D, Reshape, Activation, BatchNormalization, SpatialDropout2D
from tensorflow.keras.applications.mobilenet import MobileNet
from sklearn.model_selection import train_test_split
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
import tensorflow as tf

# to define loss
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.backend import log, epsilon

The default version of TensorFlow in Colab will soon switch to TensorFlow 2.x.
We recommend you upgrade now or ensure your notebook will continue to use TensorFlow 1.x via the %tensorflow_version 1.x magic: more info.

Load the 'images.npy' file

  • This file contains images with details of bounding boxes
In [4]:
!ls
'08_Computer Vision_Face Detection.ipynb'   images.npy	 model_1.45.h5
In [0]:
# Reference: https://stackoverflow.com/questions/55890813/how-to-fix-object-arrays-cannot-be-loaded-when-allow-pickle-false-for-imdb-loa
# The file stores Python objects (images plus annotation dicts), so pickling
# is enabled for this call only instead of patching np.load globally.
data = np.load('images.npy', allow_pickle = True)
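
The need for `allow_pickle` can be reproduced on a small synthetic array. The record below is a hypothetical stand-in that mimics the layout used later in this notebook (an image followed by a list of annotation dicts with `points`); the file name `sample.npy` is arbitrary.

```python
import os
import tempfile

import numpy as np

# Hypothetical record in the same layout: [image, list of annotations].
record = np.empty((1, 2), dtype=object)
record[0, 0] = np.zeros((4, 4, 3), dtype=np.uint8)
record[0, 1] = [{"points": [{"x": 0.1, "y": 0.2}, {"x": 0.6, "y": 0.8}]}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.npy")
    np.save(path, record)  # object arrays are pickled on save

    try:
        np.load(path)  # the default allow_pickle=False rejects object arrays
        default_load_failed = False
    except ValueError:
        default_load_failed = True

    loaded = np.load(path, allow_pickle=True)

print(default_load_failed, loaded.shape)
```

Because unpickling can execute arbitrary code, `allow_pickle=True` is best reserved for files from a trusted source.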

Check one sample from the loaded file

In [6]:
fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
plt.imshow(data[10][0])
plt.show()

Set image dimensions

  • Initialize image height, image width with value: 224
  • Alpha: 1
In [0]:
ALPHA = 1
IMAGE_SIZE = 224
IMAGE_HEIGHT = 224
IMAGE_WIDTH = 224
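
`ALPHA` is MobileNet's width multiplier, which scales the number of channels in every layer of the network. A rough sketch of that scaling, assuming the channel count is simply truncated to an integer after multiplication (`scaled_filters` is an illustrative helper, not a Keras function):

```python
def scaled_filters(base_filters, alpha):
    """Channel count after applying a width multiplier (illustrative)."""
    return int(base_filters * alpha)


# Channel counts of a few MobileNet layers at alpha = 1.0.
base = [32, 64, 128, 256, 512, 1024]
for alpha in (1.0, 0.5):
    print(alpha, [scaled_filters(f, alpha) for f in base])
```

With `ALPHA = 1` the network keeps its full width, which matches the channel counts in the model summary further down.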

Create features and labels

  • Here feature is the image
  • The label is the mask
  • Images will be stored in 'X' array
  • Masks will be stored in 'masks' array
In [0]:
masks = np.zeros((int(data.shape[0]), IMAGE_HEIGHT, IMAGE_WIDTH))
X = np.zeros((int(data.shape[0]), IMAGE_HEIGHT, IMAGE_WIDTH, 3))
for index in range(data.shape[0]):
    img = data[index][0]
    # cv2.resize expects dsize as (width, height)
    img = cv2.resize(img, dsize = (IMAGE_WIDTH, IMAGE_HEIGHT), interpolation = cv2.INTER_CUBIC)
    try:
        # Keep the first three channels (drops an alpha channel if present)
        img = img[:, :, :3]
    except IndexError:
        # Images without a channel axis (e.g. grayscale) are skipped;
        # their rows in X and masks stay all zeros
        continue
    X[index] = preprocess_input(np.array(img, dtype = np.float32))
    # Each annotation holds two corner points in normalised [0, 1] coordinates
    for i in data[index][1]:
        x1 = int(i['points'][0]['x'] * IMAGE_WIDTH)
        x2 = int(i['points'][1]['x'] * IMAGE_WIDTH)
        y1 = int(i['points'][0]['y'] * IMAGE_HEIGHT)
        y2 = int(i['points'][1]['y'] * IMAGE_HEIGHT)
        masks[index][y1:y2, x1:x2] = 1
In [9]:
X.shape
Out[9]:
(409, 224, 224, 3)
In [10]:
masks.shape
Out[10]:
(409, 224, 224)
In [11]:
n = 10
fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
_ = plt.imshow(X[n])
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
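
The clipping warning appears because MobileNet's `preprocess_input` rescales pixel values to the range [-1, 1], while `imshow` expects floats in [0, 1]. A minimal sketch of undoing that scaling for display, assuming the `x / 127.5 - 1` convention (`to_displayable` is a hypothetical helper and the three-pixel array is a stand-in for an image):

```python
import numpy as np


def to_displayable(preprocessed):
    """Map values from [-1, 1] back to [0, 1] for plotting."""
    return np.clip((preprocessed + 1.0) / 2.0, 0.0, 1.0)


# Stand-in for one preprocessed image: raw pixels scaled as x / 127.5 - 1.
raw = np.array([[[0.0, 127.5, 255.0]]], dtype=np.float32)
preprocessed = raw / 127.5 - 1.0
displayable = to_displayable(preprocessed)
print(displayable)
```

With the arrays in this notebook, the call would be `plt.imshow(to_displayable(X[n]))`.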
In [12]:
n = 10
fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
_ = plt.imshow(masks[n])

Create the model

  • Add MobileNet as model with below parameter values
    • input_shape: IMAGE_HEIGHT, IMAGE_WIDTH, 3
    • include_top: False
    • alpha: 1.0
    • weights: 'imagenet'
  • Add U-Net architecture layers
In [0]:
def conv_block_simple(prevlayer, filters, prefix, strides=(1, 1)):
    # 3x3 convolution followed by batch normalization and ReLU
    conv = Conv2D(filters, (3, 3), padding = 'same', kernel_initializer = 'he_normal', strides = strides, name = prefix + '_conv')(prevlayer)
    conv = BatchNormalization(name = prefix + 'BatchNormalization')(conv)
    conv = Activation('relu', name = prefix + 'ActivationLayer')(conv)
    return conv

def create_model(trainable = True):
    # Encoder: MobileNet backbone with ImageNet weights, without the classification head
    model = MobileNet(input_shape = (IMAGE_HEIGHT, IMAGE_WIDTH, 3), include_top = False, alpha = ALPHA, weights = 'imagenet')
    # Freeze or unfreeze every backbone layer
    for layer in model.layers:
        layer.trainable = trainable
    
    # Skip connections at 7x7, 14x14, 28x28, 56x56 and 112x112 resolution
    block1 = model.get_layer('conv_pw_13_relu').output
    block2 = model.get_layer('conv_pw_11_relu').output
    block3 = model.get_layer('conv_pw_5_relu').output
    block4 = model.get_layer('conv_pw_3_relu').output
    block5 = model.get_layer('conv_pw_1_relu').output
    
    up1 = Concatenate()([UpSampling2D()(block1), block2])
    conv6 = conv_block_simple(up1, 256, 'Conv_6_1')
    conv6 = conv_block_simple(conv6, 256, 'Conv_6_2')

    up2 = Concatenate()([UpSampling2D()(conv6), block3])
    conv7 = conv_block_simple(up2, 256, 'Conv_7_1')
    conv7 = conv_block_simple(conv7, 256, 'Conv_7_2')

    up3 = Concatenate()([UpSampling2D()(conv7), block4])
    conv8 = conv_block_simple(up3, 192, 'Conv_8_1')
    conv8 = conv_block_simple(conv8, 128, 'Conv_8_2')

    up4 = Concatenate()([UpSampling2D()(conv8), block5])
    conv9 = conv_block_simple(up4, 96, 'Conv_9_1')
    conv9 = conv_block_simple(conv9, 64, 'Conv_9_2')

    up5 = Concatenate()([UpSampling2D()(conv9), model.input])
    conv10 = conv_block_simple(up5, 48, 'Conv_10_1')
    conv10 = conv_block_simple(conv10, 32, 'Conv_10_2')
    conv10 = SpatialDropout2D(0.2)(conv10)
    
    x = Conv2D(1, (1, 1), activation = 'sigmoid')(conv10)
    x = Reshape((IMAGE_SIZE, IMAGE_SIZE))(x)
    return Model(inputs = model.input, outputs = x)
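
The first decoder step can be traced with plain NumPy to see how the shapes and parameter counts arise. This is only a shape check under the assumption of nearest-neighbour 2x upsampling and a standard convolution with bias; `upsample2x` and `conv_params` are illustrative helpers, not Keras functions.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) array."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)


def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a standard 2D convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


# First decoder step: 7x7x1024 bottleneck joined with the 14x14x512 skip.
bottleneck = np.zeros((7, 7, 1024), dtype=np.float32)
skip = np.zeros((14, 14, 512), dtype=np.float32)
merged = np.concatenate([upsample2x(bottleneck), skip], axis=-1)

print(merged.shape)
print(conv_params(3, merged.shape[-1], 256))
```

The resulting shape `(14, 14, 1536)` and the count of 3,539,200 parameters agree with the `concatenate` and `Conv_6_1_conv` rows of the model summary.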

Call the create_model function

In [14]:
model = create_model(True)
model.summary()
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 225, 225, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 112, 112, 32) 864         conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 32) 128         conv1[0][0]                      
__________________________________________________________________________________________________
conv1_relu (ReLU)               (None, 112, 112, 32) 0           conv1_bn[0][0]                   
__________________________________________________________________________________________________
conv_dw_1 (DepthwiseConv2D)     (None, 112, 112, 32) 288         conv1_relu[0][0]                 
__________________________________________________________________________________________________
conv_dw_1_bn (BatchNormalizatio (None, 112, 112, 32) 128         conv_dw_1[0][0]                  
__________________________________________________________________________________________________
conv_dw_1_relu (ReLU)           (None, 112, 112, 32) 0           conv_dw_1_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_1 (Conv2D)              (None, 112, 112, 64) 2048        conv_dw_1_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_1_bn (BatchNormalizatio (None, 112, 112, 64) 256         conv_pw_1[0][0]                  
__________________________________________________________________________________________________
conv_pw_1_relu (ReLU)           (None, 112, 112, 64) 0           conv_pw_1_bn[0][0]               
__________________________________________________________________________________________________
conv_pad_2 (ZeroPadding2D)      (None, 113, 113, 64) 0           conv_pw_1_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_2 (DepthwiseConv2D)     (None, 56, 56, 64)   576         conv_pad_2[0][0]                 
__________________________________________________________________________________________________
conv_dw_2_bn (BatchNormalizatio (None, 56, 56, 64)   256         conv_dw_2[0][0]                  
__________________________________________________________________________________________________
conv_dw_2_relu (ReLU)           (None, 56, 56, 64)   0           conv_dw_2_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_2 (Conv2D)              (None, 56, 56, 128)  8192        conv_dw_2_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_2_bn (BatchNormalizatio (None, 56, 56, 128)  512         conv_pw_2[0][0]                  
__________________________________________________________________________________________________
conv_pw_2_relu (ReLU)           (None, 56, 56, 128)  0           conv_pw_2_bn[0][0]               
__________________________________________________________________________________________________
conv_dw_3 (DepthwiseConv2D)     (None, 56, 56, 128)  1152        conv_pw_2_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_3_bn (BatchNormalizatio (None, 56, 56, 128)  512         conv_dw_3[0][0]                  
__________________________________________________________________________________________________
conv_dw_3_relu (ReLU)           (None, 56, 56, 128)  0           conv_dw_3_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_3 (Conv2D)              (None, 56, 56, 128)  16384       conv_dw_3_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_3_bn (BatchNormalizatio (None, 56, 56, 128)  512         conv_pw_3[0][0]                  
__________________________________________________________________________________________________
conv_pw_3_relu (ReLU)           (None, 56, 56, 128)  0           conv_pw_3_bn[0][0]               
__________________________________________________________________________________________________
conv_pad_4 (ZeroPadding2D)      (None, 57, 57, 128)  0           conv_pw_3_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_4 (DepthwiseConv2D)     (None, 28, 28, 128)  1152        conv_pad_4[0][0]                 
__________________________________________________________________________________________________
conv_dw_4_bn (BatchNormalizatio (None, 28, 28, 128)  512         conv_dw_4[0][0]                  
__________________________________________________________________________________________________
conv_dw_4_relu (ReLU)           (None, 28, 28, 128)  0           conv_dw_4_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_4 (Conv2D)              (None, 28, 28, 256)  32768       conv_dw_4_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_4_bn (BatchNormalizatio (None, 28, 28, 256)  1024        conv_pw_4[0][0]                  
__________________________________________________________________________________________________
conv_pw_4_relu (ReLU)           (None, 28, 28, 256)  0           conv_pw_4_bn[0][0]               
__________________________________________________________________________________________________
conv_dw_5 (DepthwiseConv2D)     (None, 28, 28, 256)  2304        conv_pw_4_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_5_bn (BatchNormalizatio (None, 28, 28, 256)  1024        conv_dw_5[0][0]                  
__________________________________________________________________________________________________
conv_dw_5_relu (ReLU)           (None, 28, 28, 256)  0           conv_dw_5_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_5 (Conv2D)              (None, 28, 28, 256)  65536       conv_dw_5_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_5_bn (BatchNormalizatio (None, 28, 28, 256)  1024        conv_pw_5[0][0]                  
__________________________________________________________________________________________________
conv_pw_5_relu (ReLU)           (None, 28, 28, 256)  0           conv_pw_5_bn[0][0]               
__________________________________________________________________________________________________
conv_pad_6 (ZeroPadding2D)      (None, 29, 29, 256)  0           conv_pw_5_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_6 (DepthwiseConv2D)     (None, 14, 14, 256)  2304        conv_pad_6[0][0]                 
__________________________________________________________________________________________________
conv_dw_6_bn (BatchNormalizatio (None, 14, 14, 256)  1024        conv_dw_6[0][0]                  
__________________________________________________________________________________________________
conv_dw_6_relu (ReLU)           (None, 14, 14, 256)  0           conv_dw_6_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_6 (Conv2D)              (None, 14, 14, 512)  131072      conv_dw_6_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_6_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_pw_6[0][0]                  
__________________________________________________________________________________________________
conv_pw_6_relu (ReLU)           (None, 14, 14, 512)  0           conv_pw_6_bn[0][0]               
__________________________________________________________________________________________________
conv_dw_7 (DepthwiseConv2D)     (None, 14, 14, 512)  4608        conv_pw_6_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_7_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_dw_7[0][0]                  
__________________________________________________________________________________________________
conv_dw_7_relu (ReLU)           (None, 14, 14, 512)  0           conv_dw_7_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_7 (Conv2D)              (None, 14, 14, 512)  262144      conv_dw_7_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_7_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_pw_7[0][0]                  
__________________________________________________________________________________________________
conv_pw_7_relu (ReLU)           (None, 14, 14, 512)  0           conv_pw_7_bn[0][0]               
__________________________________________________________________________________________________
conv_dw_8 (DepthwiseConv2D)     (None, 14, 14, 512)  4608        conv_pw_7_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_8_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_dw_8[0][0]                  
__________________________________________________________________________________________________
conv_dw_8_relu (ReLU)           (None, 14, 14, 512)  0           conv_dw_8_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_8 (Conv2D)              (None, 14, 14, 512)  262144      conv_dw_8_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_8_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_pw_8[0][0]                  
__________________________________________________________________________________________________
conv_pw_8_relu (ReLU)           (None, 14, 14, 512)  0           conv_pw_8_bn[0][0]               
__________________________________________________________________________________________________
conv_dw_9 (DepthwiseConv2D)     (None, 14, 14, 512)  4608        conv_pw_8_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_9_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_dw_9[0][0]                  
__________________________________________________________________________________________________
conv_dw_9_relu (ReLU)           (None, 14, 14, 512)  0           conv_dw_9_bn[0][0]               
__________________________________________________________________________________________________
conv_pw_9 (Conv2D)              (None, 14, 14, 512)  262144      conv_dw_9_relu[0][0]             
__________________________________________________________________________________________________
conv_pw_9_bn (BatchNormalizatio (None, 14, 14, 512)  2048        conv_pw_9[0][0]                  
__________________________________________________________________________________________________
conv_pw_9_relu (ReLU)           (None, 14, 14, 512)  0           conv_pw_9_bn[0][0]               
__________________________________________________________________________________________________
conv_dw_10 (DepthwiseConv2D)    (None, 14, 14, 512)  4608        conv_pw_9_relu[0][0]             
__________________________________________________________________________________________________
conv_dw_10_bn (BatchNormalizati (None, 14, 14, 512)  2048        conv_dw_10[0][0]                 
__________________________________________________________________________________________________
conv_dw_10_relu (ReLU)          (None, 14, 14, 512)  0           conv_dw_10_bn[0][0]              
__________________________________________________________________________________________________
conv_pw_10 (Conv2D)             (None, 14, 14, 512)  262144      conv_dw_10_relu[0][0]            
__________________________________________________________________________________________________
conv_pw_10_bn (BatchNormalizati (None, 14, 14, 512)  2048        conv_pw_10[0][0]                 
__________________________________________________________________________________________________
conv_pw_10_relu (ReLU)          (None, 14, 14, 512)  0           conv_pw_10_bn[0][0]              
__________________________________________________________________________________________________
conv_dw_11 (DepthwiseConv2D)    (None, 14, 14, 512)  4608        conv_pw_10_relu[0][0]            
__________________________________________________________________________________________________
conv_dw_11_bn (BatchNormalizati (None, 14, 14, 512)  2048        conv_dw_11[0][0]                 
__________________________________________________________________________________________________
conv_dw_11_relu (ReLU)          (None, 14, 14, 512)  0           conv_dw_11_bn[0][0]              
__________________________________________________________________________________________________
conv_pw_11 (Conv2D)             (None, 14, 14, 512)  262144      conv_dw_11_relu[0][0]            
__________________________________________________________________________________________________
conv_pw_11_bn (BatchNormalizati (None, 14, 14, 512)  2048        conv_pw_11[0][0]                 
__________________________________________________________________________________________________
conv_pw_11_relu (ReLU)          (None, 14, 14, 512)  0           conv_pw_11_bn[0][0]              
__________________________________________________________________________________________________
conv_pad_12 (ZeroPadding2D)     (None, 15, 15, 512)  0           conv_pw_11_relu[0][0]            
__________________________________________________________________________________________________
conv_dw_12 (DepthwiseConv2D)    (None, 7, 7, 512)    4608        conv_pad_12[0][0]                
__________________________________________________________________________________________________
conv_dw_12_bn (BatchNormalizati (None, 7, 7, 512)    2048        conv_dw_12[0][0]                 
__________________________________________________________________________________________________
conv_dw_12_relu (ReLU)          (None, 7, 7, 512)    0           conv_dw_12_bn[0][0]              
__________________________________________________________________________________________________
conv_pw_12 (Conv2D)             (None, 7, 7, 1024)   524288      conv_dw_12_relu[0][0]            
__________________________________________________________________________________________________
conv_pw_12_bn (BatchNormalizati (None, 7, 7, 1024)   4096        conv_pw_12[0][0]                 
__________________________________________________________________________________________________
conv_pw_12_relu (ReLU)          (None, 7, 7, 1024)   0           conv_pw_12_bn[0][0]              
__________________________________________________________________________________________________
conv_dw_13 (DepthwiseConv2D)    (None, 7, 7, 1024)   9216        conv_pw_12_relu[0][0]            
__________________________________________________________________________________________________
conv_dw_13_bn (BatchNormalizati (None, 7, 7, 1024)   4096        conv_dw_13[0][0]                 
__________________________________________________________________________________________________
conv_dw_13_relu (ReLU)          (None, 7, 7, 1024)   0           conv_dw_13_bn[0][0]              
__________________________________________________________________________________________________
conv_pw_13 (Conv2D)             (None, 7, 7, 1024)   1048576     conv_dw_13_relu[0][0]            
__________________________________________________________________________________________________
conv_pw_13_bn (BatchNormalizati (None, 7, 7, 1024)   4096        conv_pw_13[0][0]                 
__________________________________________________________________________________________________
conv_pw_13_relu (ReLU)          (None, 7, 7, 1024)   0           conv_pw_13_bn[0][0]              
__________________________________________________________________________________________________
up_sampling2d (UpSampling2D)    (None, 14, 14, 1024) 0           conv_pw_13_relu[0][0]            
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 14, 14, 1536) 0           up_sampling2d[0][0]              
                                                                 conv_pw_11_relu[0][0]            
__________________________________________________________________________________________________
Conv_6_1_conv (Conv2D)          (None, 14, 14, 256)  3539200     concatenate[0][0]                
__________________________________________________________________________________________________
Conv_6_1BatchNormalization (Bat (None, 14, 14, 256)  1024        Conv_6_1_conv[0][0]              
__________________________________________________________________________________________________
Conv_6_1ActivationLayer (Activa (None, 14, 14, 256)  0           Conv_6_1BatchNormalization[0][0] 
__________________________________________________________________________________________________
Conv_6_2_conv (Conv2D)          (None, 14, 14, 256)  590080      Conv_6_1ActivationLayer[0][0]    
__________________________________________________________________________________________________
Conv_6_2BatchNormalization (Bat (None, 14, 14, 256)  1024        Conv_6_2_conv[0][0]              
__________________________________________________________________________________________________
Conv_6_2ActivationLayer (Activa (None, 14, 14, 256)  0           Conv_6_2BatchNormalization[0][0] 
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D)  (None, 28, 28, 256)  0           Conv_6_2ActivationLayer[0][0]    
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 28, 28, 512)  0           up_sampling2d_1[0][0]            
                                                                 conv_pw_5_relu[0][0]             
__________________________________________________________________________________________________
Conv_7_1_conv (Conv2D)          (None, 28, 28, 256)  1179904     concatenate_1[0][0]              
__________________________________________________________________________________________________
Conv_7_1BatchNormalization (Bat (None, 28, 28, 256)  1024        Conv_7_1_conv[0][0]              
__________________________________________________________________________________________________
Conv_7_1ActivationLayer (Activa (None, 28, 28, 256)  0           Conv_7_1BatchNormalization[0][0] 
__________________________________________________________________________________________________
Conv_7_2_conv (Conv2D)          (None, 28, 28, 256)  590080      Conv_7_1ActivationLayer[0][0]    
__________________________________________________________________________________________________
Conv_7_2BatchNormalization (Bat (None, 28, 28, 256)  1024        Conv_7_2_conv[0][0]              
__________________________________________________________________________________________________
Conv_7_2ActivationLayer (Activa (None, 28, 28, 256)  0           Conv_7_2BatchNormalization[0][0] 
__________________________________________________________________________________________________
up_sampling2d_2 (UpSampling2D)  (None, 56, 56, 256)  0           Conv_7_2ActivationLayer[0][0]    
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 56, 56, 384)  0           up_sampling2d_2[0][0]            
                                                                 conv_pw_3_relu[0][0]             
__________________________________________________________________________________________________
Conv_8_1_conv (Conv2D)          (None, 56, 56, 192)  663744      concatenate_2[0][0]              
__________________________________________________________________________________________________
Conv_8_1BatchNormalization (Bat (None, 56, 56, 192)  768         Conv_8_1_conv[0][0]              
__________________________________________________________________________________________________
Conv_8_1ActivationLayer (Activa (None, 56, 56, 192)  0           Conv_8_1BatchNormalization[0][0] 
__________________________________________________________________________________________________
Conv_8_2_conv (Conv2D)          (None, 56, 56, 128)  221312      Conv_8_1ActivationLayer[0][0]    
__________________________________________________________________________________________________
Conv_8_2BatchNormalization (Bat (None, 56, 56, 128)  512         Conv_8_2_conv[0][0]              
__________________________________________________________________________________________________
Conv_8_2ActivationLayer (Activa (None, 56, 56, 128)  0           Conv_8_2BatchNormalization[0][0] 
__________________________________________________________________________________________________
up_sampling2d_3 (UpSampling2D)  (None, 112, 112, 128 0           Conv_8_2ActivationLayer[0][0]    
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 112, 112, 192 0           up_sampling2d_3[0][0]            
                                                                 conv_pw_1_relu[0][0]             
__________________________________________________________________________________________________
Conv_9_1_conv (Conv2D)          (None, 112, 112, 96) 165984      concatenate_3[0][0]              
__________________________________________________________________________________________________
Conv_9_1BatchNormalization (Bat (None, 112, 112, 96) 384         Conv_9_1_conv[0][0]              
__________________________________________________________________________________________________
Conv_9_1ActivationLayer (Activa (None, 112, 112, 96) 0           Conv_9_1BatchNormalization[0][0] 
__________________________________________________________________________________________________
Conv_9_2_conv (Conv2D)          (None, 112, 112, 64) 55360       Conv_9_1ActivationLayer[0][0]    
__________________________________________________________________________________________________
Conv_9_2BatchNormalization (Bat (None, 112, 112, 64) 256         Conv_9_2_conv[0][0]              
__________________________________________________________________________________________________
Conv_9_2ActivationLayer (Activa (None, 112, 112, 64) 0           Conv_9_2BatchNormalization[0][0] 
__________________________________________________________________________________________________
up_sampling2d_4 (UpSampling2D)  (None, 224, 224, 64) 0           Conv_9_2ActivationLayer[0][0]    
__________________________________________________________________________________________________
concatenate_4 (Concatenate)     (None, 224, 224, 67) 0           up_sampling2d_4[0][0]            
                                                                 input_1[0][0]                    
__________________________________________________________________________________________________
Conv_10_1_conv (Conv2D)         (None, 224, 224, 48) 28992       concatenate_4[0][0]              
__________________________________________________________________________________________________
Conv_10_1BatchNormalization (Ba (None, 224, 224, 48) 192         Conv_10_1_conv[0][0]             
__________________________________________________________________________________________________
Conv_10_1ActivationLayer (Activ (None, 224, 224, 48) 0           Conv_10_1BatchNormalization[0][0]
__________________________________________________________________________________________________
Conv_10_2_conv (Conv2D)         (None, 224, 224, 32) 13856       Conv_10_1ActivationLayer[0][0]   
__________________________________________________________________________________________________
Conv_10_2BatchNormalization (Ba (None, 224, 224, 32) 128         Conv_10_2_conv[0][0]             
__________________________________________________________________________________________________
Conv_10_2ActivationLayer (Activ (None, 224, 224, 32) 0           Conv_10_2BatchNormalization[0][0]
__________________________________________________________________________________________________
spatial_dropout2d (SpatialDropo (None, 224, 224, 32) 0           Conv_10_2ActivationLayer[0][0]   
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 224, 224, 1)  33          spatial_dropout2d[0][0]          
__________________________________________________________________________________________________
reshape (Reshape)               (None, 224, 224)     0           conv2d[0][0]                     
==================================================================================================
Total params: 10,283,745
Trainable params: 10,258,689
Non-trainable params: 25,056
__________________________________________________________________________________________________

Define dice coefficient function

  • Create a function to calculate dice coefficient
In [0]:
def dice_coefficient(y_true, y_pred):
    numerator = 2 * tf.reduce_sum(y_true * y_pred)
    denominator = tf.reduce_sum(y_true + y_pred)

    return numerator / (denominator + tf.keras.backend.epsilon())
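
To make the metric concrete, here is a minimal NumPy sketch of the same formula, 2 * sum(y_true * y_pred) / sum(y_true + y_pred), applied to tiny hand-made masks. The helper name `dice_np`, the `eps` default and the toy arrays are illustrative assumptions and are not part of the notebook.

```python
import numpy as np

def dice_np(y_true, y_pred, eps=1e-7):
    """NumPy mirror of dice_coefficient; eps stands in for the Keras epsilon."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    numerator = 2.0 * np.sum(y_true * y_pred)   # twice the overlap
    denominator = np.sum(y_true + y_pred)       # total mass of both masks
    return numerator / (denominator + eps)

truth = np.array([[1, 1], [0, 0]])      # two face pixels in the top row
perfect = truth.copy()                  # identical prediction
half = np.array([[1, 0], [0, 0]])       # only one of the two face pixels found
disjoint = np.array([[0, 0], [1, 1]])   # prediction misses the face entirely

print(dice_np(truth, perfect))   # close to 1
print(dice_np(truth, half))      # close to 2/3
print(dice_np(truth, disjoint))  # 0
```

The score moves from 0 (no overlap) towards 1 (perfect overlap), which is why it is a convenient metric for segmentation masks.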

Define loss

In [0]:
def loss(y_true, y_pred):
    return binary_crossentropy(y_true, y_pred) - log(dice_coefficient(y_true, y_pred) + epsilon())
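
The loss adds binary cross-entropy to the negative log of the dice coefficient, so it penalises both per-pixel errors and poor overlap. The NumPy sketch below illustrates that behaviour on toy data. It is a simplification, not the notebook's implementation: `bce_np`, `dice_np`, `combined_loss_np` and `EPS` are hypothetical names, and the cross-entropy is averaged over all pixels at once.

```python
import numpy as np

EPS = 1e-7  # assumed stand-in for tf.keras.backend.epsilon()

def bce_np(y_true, y_pred):
    """Binary cross-entropy averaged over every pixel (a simplification)."""
    y_pred = np.clip(y_pred, EPS, 1.0 - EPS)
    per_pixel = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(per_pixel))

def dice_np(y_true, y_pred):
    """Same dice formula as in the notebook, written with NumPy."""
    return float(2.0 * np.sum(y_true * y_pred) / (np.sum(y_true + y_pred) + EPS))

def combined_loss_np(y_true, y_pred):
    """BCE minus log(dice): the second term tends to 0 as dice approaches 1."""
    return bce_np(y_true, y_pred) - float(np.log(dice_np(y_true, y_pred) + EPS))

truth = np.array([[1.0, 1.0], [0.0, 0.0]])
good = np.array([[0.9, 0.9], [0.1, 0.1]])   # confident and correct
bad = np.array([[0.1, 0.1], [0.9, 0.9]])    # confident and wrong

print(combined_loss_np(truth, good))
print(combined_loss_np(truth, bad))
```

The correct prediction yields a much smaller loss than the wrong one, because both terms shrink as the prediction approaches the ground-truth mask.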

Compile the model

  • Compile the model using the parameters below
    • loss: use the loss function defined above
    • optimizer: use the Adam optimizer
    • metrics: use the dice_coefficient function defined above
In [0]:
adam = Adam(lr = 1e-4, beta_1 = 0.9, beta_2 = 0.999, epsilon = None, decay = 0.0, amsgrad = False)
model.compile(loss = loss, optimizer = adam, metrics = [dice_coefficient])

Define checkpoint and earlystopping

In [18]:
checkpoint = ModelCheckpoint('model_{loss:.2f}.h5', monitor = 'loss', verbose = 1, save_best_only = True, save_weights_only = True, mode = 'min', period = 1)
stop = EarlyStopping(monitor = 'loss', patience = 5, mode = 'min')
reduce_lr = ReduceLROnPlateau(monitor = 'loss', factor = 0.2, patience = 5, min_lr = 1e-6, verbose = 1, mode = 'min')
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of samples seen.
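
One side effect of the `model_{loss:.2f}.h5` filename template is that losses which round to the same two decimals map to the same file, so a later epoch overwrites an earlier checkpoint. The plain-Python sketch below shows this with three training losses copied from the training log of this notebook. The dictionary and variable names are illustrative only.

```python
template = 'model_{loss:.2f}.h5'

# Training losses reported by the checkpoint callback for epochs 18, 19 and 20
epoch_losses = {18: 0.15280, 19: 0.14760, 20: 0.14351}

# Apply the same string formatting the filename template uses
filenames = {epoch: template.format(loss=loss) for epoch, loss in epoch_losses.items()}

for epoch, name in filenames.items():
    print(epoch, name)
```

Epochs 18 and 19 both resolve to `model_0.15.h5`. This is harmless here because only improved losses are saved, but adding the epoch number to the template (for example `{epoch:02d}`) would be one way to keep every checkpoint separate.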

Fit the model

  • Fit the model using the parameters below
    • epochs: you can decide
    • batch_size: 1
    • callbacks: checkpoint, reduce_lr, stop
In [19]:
X_train, X_valid, y_train, y_valid = train_test_split(X, masks, test_size = 0.15, random_state = 2019, shuffle = False)
X_train.shape, X_valid.shape, y_train.shape, y_valid.shape
Out[19]:
((347, 224, 224, 3), (62, 224, 224, 3), (347, 224, 224), (62, 224, 224))
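
With `shuffle = False` the split keeps the original order, so the validation set is simply the tail of the data. The sketch below reproduces the 347/62 sizes with plain slicing. It assumes the test count is rounded up, which matches the shapes printed above, and it uses index lists instead of the real image arrays.

```python
import math

n_samples = 409      # images used in this project
test_size = 0.15

# Assumption: the validation count is rounded up, the rest goes to training
n_valid = math.ceil(test_size * n_samples)
n_train = n_samples - n_valid

indices = list(range(n_samples))
train_idx = indices[:n_train]   # first block, in original order
valid_idx = indices[n_train:]   # last block, in original order

print(n_train, n_valid)
print(valid_idx[0], valid_idx[-1])
```

Since nothing is shuffled, any ordering in the source data (for example, images grouped by scene) carries over into the validation set.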
In [20]:
model.fit(X_train, y_train, epochs = 30, batch_size = 1, callbacks = [checkpoint, reduce_lr, stop], validation_data = (X_valid, y_valid))
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 347 samples, validate on 62 samples
Epoch 1/30
346/347 [============================>.] - ETA: 0s - loss: 1.4227 - dice_coefficient: 0.4146
Epoch 00001: loss improved from inf to 1.42042, saving model to model_1.42.h5
347/347 [==============================] - 50s 143ms/sample - loss: 1.4204 - dice_coefficient: 0.4154 - val_loss: 1.2301 - val_dice_coefficient: 0.4663
Epoch 2/30
346/347 [============================>.] - ETA: 0s - loss: 0.9990 - dice_coefficient: 0.5275
Epoch 00002: loss improved from 1.42042 to 0.99820, saving model to model_1.00.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.9982 - dice_coefficient: 0.5277 - val_loss: 1.0558 - val_dice_coefficient: 0.4946
Epoch 3/30
346/347 [============================>.] - ETA: 0s - loss: 0.7713 - dice_coefficient: 0.6082
Epoch 00003: loss improved from 0.99820 to 0.77224, saving model to model_0.77.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.7722 - dice_coefficient: 0.6076 - val_loss: 0.9375 - val_dice_coefficient: 0.5343
Epoch 4/30
346/347 [============================>.] - ETA: 0s - loss: 0.6462 - dice_coefficient: 0.6598
Epoch 00004: loss improved from 0.77224 to 0.64609, saving model to model_0.65.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.6461 - dice_coefficient: 0.6596 - val_loss: 0.8729 - val_dice_coefficient: 0.5476
Epoch 5/30
346/347 [============================>.] - ETA: 0s - loss: 0.5229 - dice_coefficient: 0.7146
Epoch 00005: loss improved from 0.64609 to 0.52299, saving model to model_0.52.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.5230 - dice_coefficient: 0.7142 - val_loss: 1.0925 - val_dice_coefficient: 0.4922
Epoch 6/30
346/347 [============================>.] - ETA: 0s - loss: 0.4624 - dice_coefficient: 0.7450
Epoch 00006: loss improved from 0.52299 to 0.46156, saving model to model_0.46.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.4616 - dice_coefficient: 0.7454 - val_loss: 0.9269 - val_dice_coefficient: 0.5560
Epoch 7/30
346/347 [============================>.] - ETA: 0s - loss: 0.3878 - dice_coefficient: 0.7815
Epoch 00007: loss improved from 0.46156 to 0.38700, saving model to model_0.39.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.3870 - dice_coefficient: 0.7819 - val_loss: 0.8531 - val_dice_coefficient: 0.5754
Epoch 8/30
346/347 [============================>.] - ETA: 0s - loss: 0.3445 - dice_coefficient: 0.8079
Epoch 00008: loss improved from 0.38700 to 0.34439, saving model to model_0.34.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.3444 - dice_coefficient: 0.8081 - val_loss: 0.9617 - val_dice_coefficient: 0.5548
Epoch 9/30
346/347 [============================>.] - ETA: 0s - loss: 0.3009 - dice_coefficient: 0.8326
Epoch 00009: loss improved from 0.34439 to 0.30138, saving model to model_0.30.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.3014 - dice_coefficient: 0.8321 - val_loss: 0.8845 - val_dice_coefficient: 0.5864
Epoch 10/30
346/347 [============================>.] - ETA: 0s - loss: 0.2709 - dice_coefficient: 0.8494
Epoch 00010: loss improved from 0.30138 to 0.27063, saving model to model_0.27.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.2706 - dice_coefficient: 0.8494 - val_loss: 0.8476 - val_dice_coefficient: 0.5897
Epoch 11/30
346/347 [============================>.] - ETA: 0s - loss: 0.2447 - dice_coefficient: 0.8661
Epoch 00011: loss improved from 0.27063 to 0.24431, saving model to model_0.24.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.2443 - dice_coefficient: 0.8663 - val_loss: 0.8330 - val_dice_coefficient: 0.6007
Epoch 12/30
346/347 [============================>.] - ETA: 0s - loss: 0.2286 - dice_coefficient: 0.8766
Epoch 00012: loss improved from 0.24431 to 0.22864, saving model to model_0.23.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.2286 - dice_coefficient: 0.8766 - val_loss: 1.1201 - val_dice_coefficient: 0.5112
Epoch 13/30
346/347 [============================>.] - ETA: 0s - loss: 0.2226 - dice_coefficient: 0.8814
Epoch 00013: loss improved from 0.22864 to 0.22223, saving model to model_0.22.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.2222 - dice_coefficient: 0.8816 - val_loss: 0.8651 - val_dice_coefficient: 0.5966
Epoch 14/30
346/347 [============================>.] - ETA: 0s - loss: 0.1924 - dice_coefficient: 0.8987
Epoch 00014: loss improved from 0.22223 to 0.19210, saving model to model_0.19.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.1921 - dice_coefficient: 0.8987 - val_loss: 1.0170 - val_dice_coefficient: 0.5570
Epoch 15/30
346/347 [============================>.] - ETA: 0s - loss: 0.1773 - dice_coefficient: 0.9083
Epoch 00015: loss improved from 0.19210 to 0.17705, saving model to model_0.18.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.1770 - dice_coefficient: 0.9084 - val_loss: 0.9733 - val_dice_coefficient: 0.5764
Epoch 16/30
346/347 [============================>.] - ETA: 0s - loss: 0.1688 - dice_coefficient: 0.9145
Epoch 00016: loss improved from 0.17705 to 0.16861, saving model to model_0.17.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.1686 - dice_coefficient: 0.9146 - val_loss: 0.9024 - val_dice_coefficient: 0.6044
Epoch 17/30
346/347 [============================>.] - ETA: 0s - loss: 0.1627 - dice_coefficient: 0.9183
Epoch 00017: loss improved from 0.16861 to 0.16269, saving model to model_0.16.h5
347/347 [==============================] - 36s 103ms/sample - loss: 0.1627 - dice_coefficient: 0.9183 - val_loss: 1.0748 - val_dice_coefficient: 0.5745
Epoch 18/30
346/347 [============================>.] - ETA: 0s - loss: 0.1529 - dice_coefficient: 0.9251
Epoch 00018: loss improved from 0.16269 to 0.15280, saving model to model_0.15.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.1528 - dice_coefficient: 0.9251 - val_loss: 0.9182 - val_dice_coefficient: 0.6069
Epoch 19/30
346/347 [============================>.] - ETA: 0s - loss: 0.1478 - dice_coefficient: 0.9290
Epoch 00019: loss improved from 0.15280 to 0.14760, saving model to model_0.15.h5
347/347 [==============================] - 37s 106ms/sample - loss: 0.1476 - dice_coefficient: 0.9291 - val_loss: 0.9509 - val_dice_coefficient: 0.5937
Epoch 20/30
346/347 [============================>.] - ETA: 0s - loss: 0.1438 - dice_coefficient: 0.9318
Epoch 00020: loss improved from 0.14760 to 0.14351, saving model to model_0.14.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.1435 - dice_coefficient: 0.9320 - val_loss: 1.1096 - val_dice_coefficient: 0.5464
Epoch 21/30
346/347 [============================>.] - ETA: 0s - loss: 0.1469 - dice_coefficient: 0.9314
Epoch 00021: loss did not improve from 0.14351
347/347 [==============================] - 36s 103ms/sample - loss: 0.1467 - dice_coefficient: 0.9314 - val_loss: 1.0146 - val_dice_coefficient: 0.5963
Epoch 22/30
346/347 [============================>.] - ETA: 0s - loss: 0.1546 - dice_coefficient: 0.9277
Epoch 00022: loss did not improve from 0.14351
347/347 [==============================] - 36s 103ms/sample - loss: 0.1543 - dice_coefficient: 0.9277 - val_loss: 1.8896 - val_dice_coefficient: 0.4636
Epoch 23/30
346/347 [============================>.] - ETA: 0s - loss: 0.1376 - dice_coefficient: 0.9370
Epoch 00023: loss improved from 0.14351 to 0.13752, saving model to model_0.14.h5
347/347 [==============================] - 37s 106ms/sample - loss: 0.1375 - dice_coefficient: 0.9370 - val_loss: 1.2246 - val_dice_coefficient: 0.5704
Epoch 24/30
346/347 [============================>.] - ETA: 0s - loss: 0.1272 - dice_coefficient: 0.9429
Epoch 00024: loss improved from 0.13752 to 0.12716, saving model to model_0.13.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.1272 - dice_coefficient: 0.9430 - val_loss: 1.1509 - val_dice_coefficient: 0.5773
Epoch 25/30
346/347 [============================>.] - ETA: 0s - loss: 0.1179 - dice_coefficient: 0.9488
Epoch 00025: loss improved from 0.12716 to 0.11772, saving model to model_0.12.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.1177 - dice_coefficient: 0.9488 - val_loss: 1.0385 - val_dice_coefficient: 0.5940
Epoch 26/30
346/347 [============================>.] - ETA: 0s - loss: 0.1127 - dice_coefficient: 0.9525
Epoch 00026: loss improved from 0.11772 to 0.11242, saving model to model_0.11.h5
347/347 [==============================] - 36s 104ms/sample - loss: 0.1124 - dice_coefficient: 0.9526 - val_loss: 1.0285 - val_dice_coefficient: 0.6008
Epoch 27/30
346/347 [============================>.] - ETA: 0s - loss: 0.1123 - dice_coefficient: 0.9521
Epoch 00027: loss improved from 0.11242 to 0.11222, saving model to model_0.11.h5
347/347 [==============================] - 37s 106ms/sample - loss: 0.1122 - dice_coefficient: 0.9521 - val_loss: 1.1334 - val_dice_coefficient: 0.5895
Epoch 28/30
346/347 [============================>.] - ETA: 0s - loss: 0.1112 - dice_coefficient: 0.9533
Epoch 00028: loss improved from 0.11222 to 0.11099, saving model to model_0.11.h5
347/347 [==============================] - 37s 106ms/sample - loss: 0.1110 - dice_coefficient: 0.9533 - val_loss: 1.1478 - val_dice_coefficient: 0.5753
Epoch 29/30
346/347 [============================>.] - ETA: 0s - loss: 0.1074 - dice_coefficient: 0.9558
Epoch 00029: loss improved from 0.11099 to 0.10720, saving model to model_0.11.h5
347/347 [==============================] - 37s 106ms/sample - loss: 0.1072 - dice_coefficient: 0.9559 - val_loss: 1.0682 - val_dice_coefficient: 0.5926
Epoch 30/30
346/347 [============================>.] - ETA: 0s - loss: 0.1110 - dice_coefficient: 0.9549
Epoch 00030: loss did not improve from 0.10720
347/347 [==============================] - 36s 103ms/sample - loss: 0.1108 - dice_coefficient: 0.9550 - val_loss: 1.1523 - val_dice_coefficient: 0.5694
Out[20]:
<tensorflow.python.keras.callbacks.History at 0x7fe2ce2f2940>
In [21]:
model.evaluate(X_valid, y_valid, verbose = 1)
62/62 [==============================] - 8s 128ms/sample - loss: 0.9379 - dice_coefficient: 0.6261
Out[21]:
[0.9378836558711144, 0.6260623]

Get the predicted mask for a sample image

In [22]:
# Load the weights of the best checkpoint saved during training
WEIGHTS_FILE = "model_0.11.h5"
learned_model = create_model()
learned_model.load_weights(WEIGHTS_FILE)
y_pred = learned_model.predict(X_valid, verbose = 1)
62/62 [==============================] - 2s 29ms/sample
In [23]:
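# For a sample image
n = 16
# cv2.resize expects dsize as (width, height)
image = cv2.resize(X_valid[n], dsize = (IMAGE_WIDTH, IMAGE_HEIGHT), interpolation = cv2.INTER_CUBIC)
# Binarize the predicted probabilities at 0.1 and resize the mask to the image size
pred_mask = cv2.resize(1.0*(y_pred[n] > 0.1), (IMAGE_WIDTH, IMAGE_HEIGHT))

# Work on a copy so the resized image itself is left untouched
image2 = image.copy()
image2[:,:,0] = pred_mask*image[:,:,0]
image2[:,:,1] = pred_mask*image[:,:,1]
image2[:,:,2] = pred_mask*image[:,:,2]
out_image = image2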

fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
plt.imshow(out_image)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[23]:
<matplotlib.image.AxesImage at 0x7fe27481b0f0>
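
The channel-by-channel masking used in the cell above can also be written as a single broadcast multiplication. The NumPy sketch below checks that both forms agree. The array sizes and random values are made up for illustration and do not come from the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))               # toy RGB image with values in [0, 1)
prob = rng.random((4, 4))                   # toy per-pixel face probabilities
mask = (prob > 0.1).astype(image.dtype)     # binarize at the same 0.1 threshold

# Per-channel masking, one assignment per colour channel
per_channel = image.copy()
for c in range(3):
    per_channel[:, :, c] = mask * image[:, :, c]

# Equivalent form: add a trailing axis so the mask broadcasts over the channels
broadcast = image * mask[:, :, np.newaxis]

print(np.allclose(per_channel, broadcast))
```

Pixels where the mask is 0 become black in all three channels, while pixels where it is 1 keep their original colour.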
In [24]:
fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
plt.imshow(pred_mask, alpha = 1)
Out[24]:
<matplotlib.image.AxesImage at 0x7fe27482bb70>

Impose the mask on the image

In [25]:
fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
plt.imshow(X_valid[n])
plt.savefig('image.jpg', bbox_inches = 'tight', pad_inches = 0)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
In [26]:
fig = plt.figure(figsize = (15, 7.2))
ax = fig.add_subplot(1, 1, 1)
plt.axis('off')
plt.imshow(y_pred[n], alpha = 0.8)
plt.savefig('mask.jpg', bbox_inches = 'tight', pad_inches = 0)
In [27]:
from google.colab.patches import cv2_imshow
img = cv2.imread('image.jpg', 1)
mask = cv2.imread('mask.jpg', 1)
img = cv2.add(img, mask)
cv2_imshow(img)
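
`cv2.add` performs a saturating addition on 8-bit images, so bright regions clip at 255 instead of wrapping around. The NumPy sketch below contrasts wrapping, saturating and weighted blending on two made-up pixels. The weighted blend is shown only as a possible alternative, not as what the notebook does, and the 0.6 weight is an arbitrary choice.

```python
import numpy as np

img = np.array([[200, 100]], dtype=np.uint8)    # two toy image pixels
mask = np.array([[100, 100]], dtype=np.uint8)   # two toy mask pixels

# Plain uint8 addition wraps modulo 256 (200 + 100 -> 44)
wrapped = img + mask

# Saturating addition: widen the type, add, then clip back to [0, 255]
saturated = np.clip(img.astype(np.int16) + mask.astype(np.int16), 0, 255).astype(np.uint8)

# Alternative: weighted blend that keeps both inputs visible without clipping
alpha = 0.6
blended = np.round(alpha * img + (1.0 - alpha) * mask).astype(np.uint8)

print(wrapped.tolist())
print(saturated.tolist())
print(blended.tolist())
```

The saturated result matches the clipping behaviour of `cv2.add`. A blend is a possible alternative when the overlay should stay readable in bright areas.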

Conclusion

This project showed how to take a pretrained MobileNet (transfer learning), add U-Net layers on top of it, and then train and evaluate the model to predict the boundaries (mask) around faces in a given image.

  • The model was compiled with a loss based on binary cross-entropy and the dice coefficient, the Adam optimizer, and the dice coefficient as the metric.
  • Model checkpointing, early stopping and learning-rate reduction were used as callbacks.
  • The data was split into training and validation sets in an 85/15 ratio (347 and 62 images). The best training loss was 0.107 with a dice coefficient of 0.956 (epoch 29). On the validation data, model.evaluate gave a loss of 0.94 and a dice coefficient of 0.63.
  • The weights from the best checkpoint (model_0.11.h5) were loaded and used to predict masks on the validation data.
  • The predicted mask was then checked on a sample image and imposed on that image.
  • As seen in the images above, the model does a good job of predicting the mask for the sample image.