Learn Transfer Learning and Face Detection within 9 Steps!
Face_detection

Download the dataset from here

so that the images are in a directory named ‘faces/’. This dataset was generated by applying dlib’s excellent pose estimation to a few images from ImageNet tagged as ‘face’.

9 steps to implement face landmark detection with PyTorch and transfer learning

In [137]:
from __future__ import print_function, division
import os
import torch
import pandas as pd
from skimage import io, transform
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")

plt.ion()   # interactive mode
In [138]:
landmarks_frame = pd.read_csv('../faces/face_landmarks.csv')
In [139]:
len(landmarks_frame)
Out[139]:
69

Step1: Load landmarks

In [140]:
n = 65

img_name = landmarks_frame.iloc[n, 0]                    # image file name in row n (here row 65)
landmarks = landmarks_frame.iloc[n, 1:].to_numpy()       # all landmark coordinates of that image
landmarks = landmarks.astype('float').reshape(-1, 2)     # reshape into two columns (x, y); -1 lets the number of rows be inferred

print('Image name: {}'.format(img_name))
print('Landmarks shape: {}'.format(landmarks.shape))
print('First 4 Landmarks: {}'.format(landmarks[:4]))     # first 4 landmark points
Image name: person-7.jpg
Landmarks shape: (68, 2)
First 4 Landmarks: [[32. 65.]
 [33. 76.]
 [34. 86.]
 [34. 97.]]
In [141]:
def show_landmarks(image, landmarks):
    """Show image with landmarks"""
    plt.imshow(image)
    plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='^', c='r')  # column 0 gives the x coordinates,
                                                                            # column 1 gives the y coordinates

    # plt.scatter(landmarks[0, 0], landmarks[0, 1], s=10, marker='_', c='r')  # uncomment to mark only the first landmark point
    plt.pause(0.001)  # pause a bit so that plots are updated

plt.figure()
show_landmarks(io.imread(os.path.join('../faces/', img_name)),
               landmarks)
plt.show()

torch.utils.data.Dataset is an abstract class representing a dataset. Your custom dataset should inherit Dataset and override the following methods:

  • __len__ so that len(dataset) returns the size of the dataset.
  • __getitem__ to support indexing so that dataset[i] returns the i-th sample.

One sample of our dataset will be a dict {'image': image, 'landmarks': landmarks}.
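
Before the full implementation in Step 2, here is a bare-bones (purely illustrative) skeleton of that contract:

class MinimalDataset(Dataset):
    """Smallest possible custom dataset: wraps any indexable collection."""
    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)        # makes len(dataset) work

    def __getitem__(self, idx):
        return self.samples[idx]        # makes dataset[idx] work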

Step2: Construct Dataset

In [142]:
class FaceLandmarksDataset(Dataset):
    def __init__(self, csv_file, root_dir, transform = None):
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)                    # use the instance's own DataFrame, not the global one

    def __getitem__(self, idx):
        img_name = self.landmarks_frame.iloc[idx, 0]
        img_name = os.path.join(self.root_dir, img_name)
        image = io.imread(img_name)                         # read the image with skimage.io.imread as an ndarray
        landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy()   # every column after the file name in row idx
        landmarks = landmarks.astype('float').reshape(-1,2)
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)
        return sample
In [143]:
face_dataset = FaceLandmarksDataset('../faces/face_landmarks.csv', '../faces')
fig = plt.figure()

for i in range(len(face_dataset)):
    sample = face_dataset[i]

    print(i,sample['image'].shape, sample['landmarks'].shape)

    ax = plt.subplot(1,4,i+1)
    plt.tight_layout()
    ax.set_title('Sample #{}'.format(i))
    ax.axis('off')
    show_landmarks(sample['image'],sample['landmarks'])
    if i == 3:
        plt.show()
        break
0 (324, 215, 3) (68, 2)
1 (500, 333, 3) (68, 2)
2 (250, 258, 3) (68, 2)
3 (434, 290, 3) (68, 2)

Step3: Data Augmentation

  • Rescale: to scale the image; when given an int, the smaller edge is matched to it and the aspect ratio is preserved.
  • RandomCrop: to crop from the image randomly. This is data augmentation.
  • ToTensor: to convert the numpy images to torch images (we need to swap axes); a sketch of it follows after the explanation below.

We will write them as callable classes instead of simple functions so that the parameters of a transform need not be passed every time it is called. For this, we just need to implement the __call__ method and, if required, the __init__ method. We can then use a transform like this:

tsfm = Transform(params)

transformed_sample = tsfm(sample)

About __call__, see this link: https://blog.csdn.net/Yaokai_AssultMaster/article/details/70256621

The __init__ method constructs an instance; the __call__ method makes that instance callable.
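
Since ToTensor is only listed above and not implemented in this step, here is a minimal sketch of what such a callable transform could look like, assuming the same {'image', 'landmarks'} sample dict and H x W x C numpy images:

class ToTensor(object):
    """Convert ndarrays in a sample to torch tensors (sketch)."""
    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        # numpy image: H x W x C  ->  torch image: C x H x W
        image = image.transpose((2, 0, 1))
        return {'image': torch.from_numpy(image),
                'landmarks': torch.from_numpy(landmarks)}

The Rescale and RandomCrop transforms below follow exactly the same callable pattern.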

In [144]:
class Rescale(object):
    """Rescale the image in a sample to a given size.

    Args:
        output_size (tuple or int): Desired output size. If tuple, output is
            matched to output_size. If int, smaller of image edges is matched
            to output_size keeping aspect ratio the same.
    """
    def __init__(self, output_size):
        assert isinstance(output_size,(int, tuple))
        self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]

        if isinstance(self.output_size, int):
            if h>w:
                new_h, new_w = self.output_size * h / w, self.output_size
            else:
                new_h, new_w = self.output_size, self.output_size * w / h
        else:
            new_h, new_w = self.output_size

        new_h, new_w = int(new_h), int(new_w)

        img = transform.resize(image, (new_h, new_w))
        # h and w are swapped for landmarks because for images,
        # x and y axes are axis 1 and 0 respectively

        landmarks = landmarks * [new_w / w, new_h / h]   # x coordinates scale with width, y coordinates with height

        return {'image': img, 'landmarks': landmarks}
In [145]:
scale = Rescale(256)   ### input is int
sample = face_dataset[1]
rescaled_sample = scale(sample)
print("Before image shape is {}".format(sample['image'].shape))
print("After Rescaled, image shape is {}".format(rescaled_sample['image'].shape))
show_landmarks(rescaled_sample['image'],rescaled_sample['landmarks'])
Before image shape is (500, 333, 3)
After Rescaled, image shape is (384, 256, 3)
In [146]:
scale = Rescale((256,256))   ### input is tuple
sample = face_dataset[1]
rescaled_sample = scale(sample)
print("Before image shape is {}".format(sample['image'].shape))
print("After Rescaled, image shape is {}".format(rescaled_sample['image'].shape))
show_landmarks(rescaled_sample['image'],rescaled_sample['landmarks'])
Before image shape is (500, 333, 3)
After Rescaled, image shape is (256, 256, 3)
In [147]:
class RandomCrop(object):
    """Crop randomly the image in a sample.

    Args:
        output_size (tuple or int): Desired output size. If int, square crop
            is made.
    """
    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        if isinstance(output_size, int):
            self.output_size = (output_size, output_size)
        else:
            assert len(output_size) == 2
            self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]

        new_h, new_w = self.output_size

        top = np.random.randint(0, h - new_h + 1)    # + 1 so an image exactly output_size high/wide is still valid
        left = np.random.randint(0, w - new_w + 1)

        image = image[top: top + new_h,
                     left: left + new_w]

        landmarks = landmarks - [left, top]
        return {'image': image, 'landmarks': landmarks}
In [148]:
crop = RandomCrop(224)   ### input is int
sample = face_dataset[1]
cropped_sample = crop(sample)
print("Before image shape is {}".format(sample['image'].shape))
print("After crop, image shape is {}".format(cropped_sample['image'].shape))
show_landmarks(cropped_sample['image'], cropped_sample['landmarks'])
Before image shape is (500, 333, 3)
After crop, image shape is (224, 224, 3)

Cropping the original image directly at random may lose part of it; rescaling first and then cropping works better.

In [149]:
scale = Rescale((256,256))   ### input is tuple
sample = face_dataset[1]
rescaled_sample = scale(sample)
print("Before image shape is {}".format(sample['image'].shape))
print("After Rescaled, image shape is {}".format(rescaled_sample['image'].shape))
show_landmarks(rescaled_sample['image'],rescaled_sample['landmarks'])


crop = RandomCrop(224)   ### input is int
cropped_sample = crop(rescaled_sample)
print("Before image shape is {}".format(rescaled_sample['image'].shape))
print("After crop, image shape is {}".format(cropped_sample['image'].shape))
show_landmarks(cropped_sample['image'], cropped_sample['landmarks'])
Before image shape is (500, 333, 3)
After Rescaled, image shape is (256, 256, 3)
Before image shape is (256, 256, 3)
After crop, image shape is (224, 224, 3)
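
To chain the two transforms in one pass, they can be wrapped in torchvision.transforms.Compose (already imported above). A minimal sketch:

composed = transforms.Compose([Rescale(256),
                               RandomCrop(224)])

sample = face_dataset[1]
transformed_sample = composed(sample)
print(transformed_sample['image'].shape)   # expected: (224, 224, 3)
show_landmarks(transformed_sample['image'], transformed_sample['landmarks'])

Compose simply calls each transform in order on the sample dict, so it works with our dict-based transforms; such a composed transform can also be passed to FaceLandmarksDataset through its transform argument.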