facial_recognition

About

In this project, I have built a facial recognition system using a convolutional neural network. The network uses the VGG-16 architecture and is implemented with the Keras library in Python.

This page describes the implementation details and the references used. If you only want to know how to use the project, please refer to the project’s readme.

Project overview

The facial recognition problem is approached in the following steps:

  1. Detect faces in the image - the image may contain background that is not needed for face recognition, so the face must be cropped out. For this, I have used OpenCV’s frontal-face Haar cascade.
  2. Calculate the unknown face’s encoding - this is the heart of the project; the details are discussed in the next section.
  3. Compare the face encodings - compare the unknown face’s encoding with all the known encodings and return the name of the most similar one.
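Step 3 can be sketched as a nearest-neighbour search over the known encodings. The names, vector values, and the choice of Euclidean distance below are illustrative assumptions, not the project’s exact code:

```python
import numpy as np

# Hypothetical known encodings: name -> feature vector.
# (Real VGG-Face descriptors are much higher-dimensional; these are toy values.)
known_encodings = {
    "alice": np.array([1.0, 0.0, 0.0]),
    "bob":   np.array([0.0, 1.0, 0.0]),
}

def best_match(unknown, known):
    """Return the name whose encoding has the smallest Euclidean
    distance to the unknown encoding (step 3 above)."""
    return min(known, key=lambda name: np.linalg.norm(known[name] - unknown))

print(best_match(np.array([0.9, 0.1, 0.0]), known_encodings))  # → alice
```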

Encoding Faces

What we need is a way to extract a few basic measurements from each face. Then we could measure our unknown face the same way and find the known face with the closest measurements. For example, we might measure the size of each ear, the spacing between the eyes, the length of the nose, etc. But we do not know which measurements best encode a face. Hence we use a convolutional neural network, which can learn the encoding automatically from the given images.

For this I have followed the approach as suggested in this paper:

O. M. Parkhi, A. Vedaldi and A. Zisserman, “Deep Face Recognition”, Proceedings of the British Machine Vision Conference (BMVC), 2015. paper.

Here is a brief discussion of the method proposed in the above paper.

The basic objective is to obtain a network that outputs similar feature vectors for faces of the same person. Based on this objective, the authors propose training the convolutional network on triplets of face images: two of the three images are of the same person, while the third is of a different person. To train the network to produce similar encodings for the first two images and a different encoding for the third, they propose a triplet loss function. This loss aims to minimize the distance between the encodings of the first two images while maximizing the distance to the third.

VGG-16 Net

The convnet architecture used here is VGG-16 as shown below:
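The architecture consists of five blocks of 3×3 convolutions, each followed by 2×2 max pooling, and three fully connected layers. One way to sanity-check the configuration is to compute the parameter count from this description alone (pure Python, no Keras needed); it recovers the well-known ~138M figure for VGG-16:

```python
# VGG-16 configuration: five conv blocks of (out_channels, num_convs),
# each followed by 2x2 max pooling, then three fully connected layers.
CONV_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]
FC_SIZES = [4096, 4096, 1000]

def vgg16_param_count(in_channels=3, input_size=224):
    params, size = 0, input_size
    for out_channels, n_convs in CONV_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel
            params += (3 * 3 * in_channels + 1) * out_channels
            in_channels = out_channels
        size //= 2  # 2x2 max pooling halves the spatial size
    fan_in = size * size * in_channels  # 7 * 7 * 512 = 25088
    for out in FC_SIZES:
        params += (fan_in + 1) * out
        fan_in = out
    return params

print(vgg16_param_count())  # 138357544 ≈ 138M parameters
```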

image source

Implementation

Code

  • The entire project is implemented in Python and available here.
  • The images to be recognized later must be placed in the known folder.
  • The code to generate vectors for the known folder is in vgg.py.
  • The code to compare an unknown face with the known faces is in compare.py.
  • Image pre-processing and the code to crop out faces are implemented in both files, so any image containing a single face can be passed.
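As a rough sketch of what compare.py does, the unknown vector can be scored against each known vector and the best-scoring name returned. The use of cosine similarity here, and all names and values, are illustrative assumptions rather than the project’s exact code:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for the vectors generated from the `known` folder.
known = {"alice": np.array([1.0, 2.0, 0.0]), "bob": np.array([0.0, 0.0, 3.0])}
unknown = np.array([1.0, 2.1, 0.1])

# Report the known face whose vector points in the most similar direction.
name = max(known, key=lambda n: cosine_similarity(known[n], unknown))
print(name)  # → alice
```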

Usage

Information on code usage and requirements is available in the project readme.

References

Apart from the paper mentioned above, I have used the following resources: