“LEAVE ME ALONE.”
These words send a shiver down my spine. But then again, they are the only comfort I get when I use Snapchat these days.
You’re probably wondering:
“Why is Snapchat scaring this moron?”
Look:
I don’t know about you, BUT I SURE AS HELL don’t enjoy sharing my bed with Casper or any other creepy ghosts that this otherworld-R.S.V.P-app has brought to my life.
You see, every once in a while I’m doing my dog filter faces like a normal human being in 2017; but then… my cat stops moving and stares at the end of the room… the camera refocuses… and then: it finds an invisible Dalmatian filter standing by my side.
I’ve moved twice already… but NO LONGER! You millennial Bloody Mary, fishing for likes!
A buddy of mine told me to look into the field of Computer Vision … It’s a crazy field where machines learn how to extract useful information from images as we humans do.
Simply put, it replicates the process of looking at someone, recognizing them, and saying: “Hey empty void of solitude, how’s it going?” and then getting no answer back because I have no one in my life. Or when you’re looking for your car in a parking lot: you scan the place and find your car at the farthest spot available.
Whichever one, it’s the same process. Except you can teach this to a machine.
Knowing this, I’ve decided to venture on and write a 2-part article with everything I’ve learned. In this part, I’ll focus on face detection using OpenCV, and in the next, I’ll dive into face recognition.
And it gets better:
I’ll give a short background so we know where we stand, go over some theory, and do a little coding in OpenCV, which is easy to use and learn (and free!).
Finally, I’ll decide whether I should stay put and keep on selfie-ing (word TM pending) online or have to move once again.
1. BACKGROUND ON FACE DETECTION
Face Detection has been one of the hottest topics of computer vision for the past few years.
This technology has been available for some years now and is being used all over the place.
From cameras that make sure faces are in focus before you take a picture, to Facebook automatically tagging people once you upload a picture (you used to do that manually, remember?).
Shows like CSI use it to identify “bad guys” from security footage (ENHANCE! Insert crime pun here), and you can even unlock your phone just by looking at it!
In short, here is how Face Detection and Face Recognition work together when unlocking your phone:
You look at your phone, and it extracts your face from an image (the nerdy name for this process is face detection). Then, it compares the current face with the one it saved before during training and checks if they both match (its nerdy name is face recognition) and, if they do, it unlocks itself.
As you can see, this technology does a lot more than let me be any type of dog I want. People are getting pretty interested in it because of its many applications: ATMs with face detection and face recognition software have been introduced for withdrawing money, and emotion analysis is gaining relevance for research purposes.
So if you’re thinking:
“I wish I could build my own ATM with facial detection or facial recognition.”
Then stop dreaming: you’ll have a face detector of your own by the end of this article (banking system not included).
This article is divided into four parts:
- I’ll explain the nerdy (and a bit complicated) theory behind OpenCV’s 2 pre-trained classifiers for face detection.
- I’ll show you the coding process I followed.
- I’ll compare both algorithms to see which is quicker and which is more accurate for face detection.
- I’ll include a Snapchat selfie at the end.
Let’s see if this ghost can keep up with me.
2. THEORY OF FACE DETECTION CLASSIFIERS
A computer program that decides whether an image is a positive image (face image) or a negative image (non-face image) is called a classifier. A classifier is trained on hundreds of thousands of face and non-face images to learn how to classify a new image correctly. OpenCV provides us with two classifiers that are pre-trained and ready to use for face detection:
- Haar Classifier
- LBP Classifier
Both of these classifiers process images in grayscale, basically because we don't need color information to decide whether a picture has a face or not (we'll talk more about this later on). As these come pre-trained in OpenCV, the files containing their learned knowledge are also bundled with OpenCV, in the opencv/data/ folder.
To run a classifier, we need to load its knowledge file first; on its own, a classifier has no knowledge, just like a newborn baby (stupid babies).
Each file starts with the name of the classifier it belongs to. For example, a Haar cascade classifier’s file is named something like haarcascade_frontalface_alt.xml.
These are the two types of classifiers we will be using to analyze Casper.
2.1 HAAR CLASSIFIER
The Haar Classifier is a machine-learning-based approach: an algorithm created by Paul Viola and Michael Jones that (as mentioned before) is trained on many, many positive images (with faces) and negative images (without faces).
Please don’t say:
“But I didn’t learn any computer magic in my minor at uni. How exactly does this work?”
It starts by extracting Haar features from each image as shown by the windows below:
Each window is placed on the picture to calculate a single feature. This feature is a single value obtained by subtracting the sum of pixels under the white part of the window from the sum of the pixels under the black part of the window.
Now, all possible sizes of each window are placed on all possible locations of each image to calculate plenty of features.
For example, in the image above, we are extracting two features. The first one focuses on the property that the region of the eyes is often darker than the area of the nose and cheeks. The second feature relies on the property that the eyes are darker than the bridge of the nose.
But among all the features calculated, most are irrelevant. For example, when placed on a cheek, these windows become irrelevant, because no area there is noticeably darker or lighter than its surroundings: all sectors are roughly the same.
So we promptly discard the irrelevant features and keep only the relevant ones with a fancy technique called AdaBoost, a training process that selects only those features known to improve the classification (face/non-face) accuracy of our classifier.
Finally, the algorithm exploits the fact that, in general, most of an image is a non-face region. Given that, it’s better to have a quick way to check whether a window is a non-face region and, if it is, discard it right away and never process it again, so we can focus on the areas where a face might actually be.
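To make the window computation concrete, here is a toy sketch in numpy. The patch, its values, and the two-rectangle window are all made up for illustration; a real trained classifier uses thousands of such windows at many sizes and positions.

```python
import numpy as np

# a tiny 4x2 patch of grayscale pixel intensities (made-up values:
# a light region on top, a dark-looking sensor... just kidding, random numbers)
patch = np.array([[20, 30],
                  [25, 35],
                  [200, 210],
                  [205, 215]])

# a two-rectangle Haar-like window: top half "white", bottom half "black"
white = patch[:2, :].sum()   # sum of pixels under the white part -> 110
black = patch[2:, :].sum()   # sum of pixels under the black part -> 830
feature = black - white      # single value for this window position
print(feature)               # -> 720
```

Sliding every window size over every location of the image is what produces the huge pool of features that AdaBoost then prunes.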
Now, let’s move on to contender #2.
2.2 LBP CASCADE CLASSIFIER
Like any other classifier, the Local Binary Patterns classifier, or LBP for short, also needs to be trained on hundreds of images. LBP is a visual/texture descriptor, and thankfully, our faces are also composed of micro visual patterns.
So, LBP features are extracted to form a feature vector that classifies a face from a non-face.
“But how are LBP features found?”
Each training image is divided into some blocks as shown in the picture below.
For each block, LBP looks at 9 pixels (3×3 window) at a time, and with a particular interest in the pixel located in the center of the window.
Then, it compares the value of the central pixel with that of each of its 8 neighbors in the 3×3 window. Each neighbor whose value is greater than or equal to the center pixel’s gets set to 1; the others get set to 0.
After that, it reads the updated pixel values (which can be either 0 or 1) in a clockwise order and forms a binary number. Next, it converts the binary number into a decimal number, and that decimal number is the new value of the center pixel. We do this for every pixel in a block.
Then, it builds a histogram of the values in each block, so now we have one histogram for each block in the image, like this:
Finally, it concatenates these block histograms to form one feature vector per image, which contains all the features we are interested in. And that’s how we extract LBP features from a picture.
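The per-pixel step above is easy to sketch in plain Python. The 3×3 window below is made up, and real LBP implementations vary in where the clockwise read starts, so treat this as one illustrative convention:

```python
import numpy as np

def lbp_value(window):
    """Compute the LBP code for one 3x3 window (numpy array)."""
    center = window[1, 1]
    # neighbors read clockwise, starting at the top-left corner
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    # neighbor >= center becomes 1, everything else becomes 0
    bits = [1 if window[r, c] >= center else 0 for r, c in offsets]
    # read the bits as a binary number and convert to decimal:
    # that decimal is the new value of the center pixel
    return int("".join(map(str, bits)), 2)

window = np.array([[6, 5, 2],
                   [7, 6, 1],
                   [9, 8, 7]])
print(lbp_value(window))  # bits 10001111 -> 143
```

Doing this for every pixel, histogramming each block, and concatenating the histograms gives the feature vector described above.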
Fun, right? No?
So after all this theory, you should be able to know:
Which classifier to use for face detection and when…Right?
Can’t?
Me neither! Let’s compare them both so we can make a decision.
2.3 HAAR VS. LBP. WHICH IS BEST FOR FACE DETECTION?
I recently spoke with Ghostbusters’ Bill Murray in search of what we need to catch a ghost.
Unfortunately, “You need balls and please stop calling,” doesn’t work for us.
So here I stacked them both against each other:
Each OpenCV face detection classifier has its pros and cons, but the major differences are in accuracy and speed.
So, in case more accurate detections are required, Haar classifier is the way to go. This bad boy is more suitable in technology such as security systems or high-end stalking.
But the LBP classifier is faster, therefore, should be used in mobile applications or embedded systems.
Ok, now we’re ready for my favorite part:
Cold. Hard. Coding
3. CODING USING OPENCV AND PYTHON
To hunt down this witch, I’ll start by covering dependencies, then import libraries, run the Haar code, deal with some trouble, and finally run LBP.
3.1 DEPENDENCIES
Let's first install the required dependencies to run this code.
- OpenCV 3.2.0 should be installed.
- Python v3.5 should be installed.
- (Optional) Matplotlib 2.0 should be installed if you want to see results in an organized manner as I've shown in this tutorial. But it's completely optional.
Note: If you don't want to install matplotlib, then replace matplotlib code with OpenCV code as shown below:
Instead of:
You can use:
- plt.imshow(img, color_map): This is a matplotlib function used to display an image. It takes two arguments: the first is the image you want to display and the second is the colormap (e.g., gray) the image is in.
- cv2.imshow(window_name, image): This is a cv2 function used to display the image. It also takes two arguments: the first one is the name of the window that will pop-up to show the picture and the second one is the image you want to display.
- cv2.waitKey(x): This is a keyboard-binding function which takes one argument: a time (x) in milliseconds. It waits x milliseconds for a keyboard event; if any key is pressed in that time, the program continues. If 0 is passed, it waits indefinitely for a keystroke.
- cv2.destroyAllWindows(): This simply destroys all the windows we created using cv2.imshow(window_name, image)
Please keep these functions in mind, as I will use them in the following code.
3.2 IMPORTING REQUIRED LIBRARIES FROM OPENCV
Let's import the necessary libraries first. Remember, the names of these libraries are self-descriptive so you can put 2 and 2 together.
When you load an image using OpenCV, it loads it into BGR color space by default. To show the colored image using matplotlib we have to convert it to RGB space. The following is a helper function to do exactly that:
cv2.cvtColor is an OpenCV function to convert images to different color spaces. It takes as input an image to transform, and a color space code (like cv2.COLOR_BGR2RGB) and returns the processed image.
Now that we are all setup let's start coding our first face detector: Haar.
3.3 OPENCV CODE 1– HAAR CASCADE CLASSIFIER
Let's start with a simple task, load our input image, convert it to grayscale mode and then display it.
For this, we’ll need to keep at arm’s reach that very handy function cv2.cvtColor (which, among other conversions, turns images grayscale).
This step is necessary because many operations in OpenCV are done in grayscale for performance reasons.
To read/load our image, I used OpenCV’s built-in function cv2.imread(img_path), passing our image path as the input parameter, and then converted it to grayscale.
To display our image, I’ll use the plt.imshow(img, cmap) function of matplotlib.
Before we can continue with face detection, we have to load our Haar cascade classifier.
OpenCV provides us with a class cv2.CascadeClassifier which takes as input the training file of the (Haar/LBP) classifier we want to load and loads it for us. Easy-breezy-Covergirl.
Since we want to load our favorite (for now), the Haar classifier: its XML training files are stored in the opencv/data/haarcascades/ folder. You can also find them in the data folder of the GitHub repo I’ll share at the end of this article.
Let's load up our classifier:
Now, how do we detect a face from an image using the CascadeClassifier we just loaded?
Well, again, OpenCV's CascadeClassifier has made it simple for us: it comes with the function detectMultiScale, which does exactly that. Here are some details of its options/arguments:
- detectMultiScale(image, scaleFactor, minNeighbors): This is a general function to detect objects; in this case, it'll detect faces since we called it on the face cascade. If it finds faces, it returns a list of their positions, each in the form Rect(x, y, w, h); if not, it returns an empty sequence.
- image: The first input is the grayscale image, so make sure your image is in grayscale.
- scaleFactor: This parameter compensates for the false perception in size that occurs when one face appears bigger than another simply because it is closer to the camera.
- minNeighbors: The detection algorithm uses a moving window to detect objects; this parameter defines how many neighboring detections must be found near the current window before it declares the face found.
There are other parameters as well, and you can review the full details of these functions here. These parameters need to be tuned according to your own data.
Now that we know a straightforward way to detect faces, we can find the face in our test image.
Then, let's find one!
The following code will try to detect a face from the image and, if detected, it will print the number of faces that it has found, which in our case should be 1. Only 1 since no other spiritual being is out there. Right?…
Right?
Woohoo! We found our face! And it’s only one beautiful face at a time…
Next, let's loop over the list of faces (rectangles) it returned and draw those rectangles, using yet another built-in OpenCV function, rectangle, on our original colored image to see if it found the right faces:
Let’s display the original image to see the rectangles we just drew and verify that detected faces are real ones and not any false positives.
Unfortunately, this code is more scattered than my attention span during high school. Let’s turn this into a function that is completely reusable.
3.4 GROUPING OPENCV CODE INTO A FUNCTION
So, let’s recap a little to get this asshole…
Our detect_faces function takes 3 input parameters:
The first is our loaded CascadeClassifier, the second is the image we want to detect faces on, and the third is the scaleFactor.
Inside the function, first I made a copy img_copy of the passed image. This way we'll do all of the operations on a copy and not the original image. Then, I converted the copied image img_copy to grayscale, as our face detector expects a grayscale image.
After that, I called the detectMultiScale function of our CascadeClassifier to return the list of detected faces (which is the list of rectangles Rect(x, y, w, h)).
Once we have the list of detected faces, I loop over them and draw a rectangle on the copy of the image. In the end, I return the modified copy of the picture.
So, that code is pretty much the same as before; it’s just grouped inside a function for reusability.
Now let's try this function on another test image:
3.5 DEALING WITH FALSE POSITIVES
Let's try our face detector on another test image:
Before I start packing everything I own to move to exactly-anywhere-but-here, let me check this once again. What could’ve gone wrong?
Oh, OK: some faces were closer to the camera, so they appeared bigger than the faces in the back, and that was scaring the bejeezus out of me.
A simple tweak to the scale factor compensates for this, so we can move that parameter around. For example, scaleFactor=1.2 improved the results.
Please, remember to tune these parameters according to the information you have about your data.
3.6 OPENCV CODE 2 – LBP CASCADE CLASSIFIER
XML training files for LBP cascade are stored in the opencv/data/lbpcascades/ folder, and their names start off with, as you might have guessed, lbpcascade.
From a coding perspective, you don't have to change anything in our face detection code except, instead of loading the Haar classifier training file you have to load the LBP training file, and rest of the system stays the same.
Well, that is damn convenient. Thanks again, OpenCV.
You can also try as many detectors (eye detector, smile detector, etc.) as you want without changing much of the code (leave a comment if you’d like to build one of those… Ghosts: no comments allowed!).
As you can see, since the code is exactly the same, I just loaded up our CascadeClassifier with the LBP training file this time. I read a test image and called our detect_faces function, which returned an image with a face drawn on it.
I’ll give another test image a try as well:
4. HAAR VS. LBP RESULTS ANALYSIS
After dealing with this whole ghost thing, I know first-hand that you can’t trust anybody these days. So perhaps you’re not entirely convinced by the chart I showed above comparing the Haar and LBP classifiers.
No big deal! I’ll run both Haar and LBP on two test images to see accuracy and time delay of each.
I loaded both Haar and LBP classifiers and two test images: test1 and test2.
4.1 TEST 1
I’ll try both classifiers on test1 image:
I used Python’s time.time() function to keep track of time. Before I start finding faces in our test image, I note the start time t1; then I call our detect_faces function; then I note the end time t2. The difference between them, dt1 = t2 - t1, is what we are interested in, as it’s the time our face detector took to detect faces.
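The timing pattern, sketched with a stand-in workload so it runs by itself (the commented-out line is where the real detection call goes):

```python
import time

t1 = time.time()               # start time, noted just before detection
# haar_detected_img = detect_faces(haar_face_cascade, test1)  # call being timed
sum(range(1_000_000))          # stand-in workload for this self-contained sketch
t2 = time.time()               # end time, noted right after
dt1 = t2 - t1                  # seconds the face detector took
print(round(dt1, 4), 'seconds')
```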
Let's do the same for LBP classifier.
Now that we have the face detected images and measured the time both of our face detectors took, let's see how they stack up against each other!
I’m going to use the matplotlib function subplots(rows, cols, figsize) to display the results side by side, with each time difference as the title of its subplot (image window).
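Sketched with placeholder images and placeholder timings (the dt1/dt2 numbers here are zeros to be filled in with your measurements, not my results):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # optional: keeps the sketch display-free
import matplotlib.pyplot as plt

# placeholders for the two annotated result images and their measured times
haar_img = np.zeros((100, 100, 3), dtype=np.uint8)
lbp_img = np.zeros((100, 100, 3), dtype=np.uint8)
dt1, dt2 = 0.0, 0.0            # substitute the values from the timing step

fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].set_title('Haar: {:.3f} s'.format(dt1))  # time difference as subplot title
axes[0].imshow(haar_img)
axes[1].set_title('LBP: {:.3f} s'.format(dt2))
axes[1].imshow(lbp_img)
plt.show()
```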
- Accuracy: Both Haar and LBP detected faces successfully.
- Speed: LBP was faster than Haar.
4.2 TEST 2
Let's see the results for test2 image. The code is the same as for test1:
- Accuracy: Haar detected more faces than LBP.
- Speed: LBP was significantly faster than Haar.
5. TECHNICAL END NOTES
As you can see, LBP is significantly faster than Haar and not that much behind in accuracy. Depending on your application, use any of the face detection algorithms in Python that we just learned.
6. PERSONAL LIFE NOTES
I feel ready.
Almost ready…
Actually: Kinda ready to figure out who or what has been trying to get followers through my pics.
I’m halfway there, so don’t miss out on how this crazy adventure unfolds.
Little miss-creep-a-lot better think again before she (or he, it’s 2017) pops in any of my selfies ever again.
You can download the complete code we used in this face detection tutorial from this repo along with test images and LBP and Haar training files.
Let’s all thank OpenCV for allowing the implementation of the above-mentioned algorithms and making our life so much easier.