Haar Cascades vs. LBP Cascades in Face Detection

An LBP cascade can be trained to perform similarly to (or better than) the Haar cascade, but out of the box, the Haar cascade is about 3x slower and, depending on your data, about 1-2% better at accurately detecting the location of a face. This increase in accuracy is quite significant given that face detection can operate in the 95%+ accuracy range.

Below are some results when using the MUCT dataset.

A correct detection is noted when there is at least a 50% overlap between the ground-truth and OpenCV detected coordinates.
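The answer does not say exactly how "overlap" was measured, so the following is only a sketch under one common assumption: overlap is intersection-over-union (IoU) of two axis-aligned boxes given as (x, y, w, h). The function names are hypothetical and not part of OpenCV.

```python
def overlap_ratio(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes (assumed metric)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle, clamped at zero.
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_hit(ground_truth, detection, threshold=0.5):
    """True when the detection overlaps the ground truth by at least `threshold`."""
    return overlap_ratio(ground_truth, detection) >= threshold
```

If the original test used a different definition (for example, intersection divided by the ground-truth area), the hit counts would shift somewhat.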

Cascade:haarcascade_frontalface_alt2.xml
Datafile:muct.csv
|---------------------------------------------------|
|   Hits  |  Misses  | False Detects  | Multi-hit   |
|  3635   |   55     |   63           |    5        |
|---------------------------------------------------|
Time:4m2.060s

vs:

Cascade:lbpcascade_frontalface.xml
Datafile:muct.csv
|---------------------------------------------------|
|   Hits  |  Misses  | False Detects  | Multi-hit   |
|  3569   |  106     |   77           |    3        |
|---------------------------------------------------|
Time:1m12.511s
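To relate these two tables to the "about 3x slower" and "1-2% better" figures, here is a small sketch that recomputes them. It assumes hit rate means hits / (hits + misses), which the answer does not state explicitly; the helper names are made up for illustration.

```python
def parse_time(text):
    """Convert a shell-style duration such as '4m2.060s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def hit_rate(result):
    """Fraction of faces found, assuming hits / (hits + misses)."""
    return result["hits"] / (result["hits"] + result["misses"])


# Numbers copied from the two result tables above.
haar = {"hits": 3635, "misses": 55, "false_detects": 63, "time": "4m2.060s"}
lbp = {"hits": 3569, "misses": 106, "false_detects": 77, "time": "1m12.511s"}

speed_ratio = parse_time(haar["time"]) / parse_time(lbp["time"])

print(f"Haar hit rate: {hit_rate(haar):.2%}")
print(f"LBP hit rate:  {hit_rate(lbp):.2%}")
print(f"Haar took {speed_ratio:.2f}x as long as LBP")
```

Under that assumption the hit rates come out at roughly 98.5% for Haar and 97.1% for LBP, with Haar taking roughly 3.3 times as long on this run.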

This may be useful to you:

There is the Simd Library, which has an implementation of HAAR and LBP cascade classifiers. It can use the standard HAAR and LBP cascades from OpenCV. This implementation has SIMD optimizations using SSE4.1, AVX2, AVX-512 and NEON (ARM), so it runs 2-3 times faster than the original OpenCV implementation.


My personal opinion is that you should look into LBP for all detection-related tasks, simply because LBP training can take minutes while HAAR training can take days for the same training data set and parameters.

The answer to your question depends on the type of thing being detected, the training settings, the parameters used during detection, and the criteria used for testing the cascades.

The accuracy of both HAAR and LBP cascades depends on the data sets (positive and negative samples) used for training them and on the parameters used during training.

According to Lienhart et al., 2002, in the case of face detection:

  • your -numStages, -maxDepth and -maxWeakCount parameters should be sufficiently high to achieve the desired -minHitRate and -maxFalseAlarmRate,
  • tree-based training is more accurate than stump-based,
  • Gentle AdaBoost is preferable to Discrete and Real AdaBoost,
  • the minimum size of the training samples matters, but a systematic study of it has yet to be done.
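As a rough illustration of how these parameters fit together, the sketch below builds an opencv_traincascade argument list and estimates the overall rates a cascade would reach if every stage just met its per-stage targets. The file names, sample counts and window size are hypothetical placeholders, and the numeric values are only starting points, not recommendations.

```python
def traincascade_args(feature_type, num_stages=20, min_hit_rate=0.995,
                      max_false_alarm_rate=0.5, max_depth=1,
                      max_weak_count=100, boost_type="GAB"):
    """Build an opencv_traincascade command line as a list of strings."""
    return [
        "opencv_traincascade",
        "-data", "cascade_out",       # hypothetical output directory
        "-vec", "positives.vec",      # hypothetical positive samples file
        "-bg", "negatives.txt",       # hypothetical negative image list
        "-numPos", "1000", "-numNeg", "2000",  # placeholder counts
        "-w", "24", "-h", "24",       # placeholder training window size
        "-featureType", feature_type,  # "HAAR" or "LBP"
        "-bt", boost_type,             # "GAB" selects Gentle AdaBoost
        "-numStages", str(num_stages),
        "-minHitRate", str(min_hit_rate),
        "-maxFalseAlarmRate", str(max_false_alarm_rate),
        "-maxDepth", str(max_depth),   # 1 gives stumps, larger values give trees
        "-maxWeakCount", str(max_weak_count),
    ]


def overall_rates(num_stages, min_hit_rate, max_false_alarm_rate):
    """Estimated cascade-wide hit and false-alarm rates (per-stage rate ** stages)."""
    return min_hit_rate ** num_stages, max_false_alarm_rate ** num_stages
```

For example, `overall_rates(20, 0.995, 0.5)` gives an overall hit rate of about 0.90 and a false-alarm rate below one in a million, which shows why the per-stage targets and the stage count have to be chosen together. The list returned by `traincascade_args("LBP")` could be passed to `subprocess.run` on a machine where the tool is installed.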

Also, the flags and parameters passed to detectMultiScale() yield a drastic change in speed as well as accuracy on a given hardware configuration.
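To see this effect on your own data, one option is to time the same cascade under different settings. The sketch below assumes the Python bindings, where detectMultiScale accepts the keyword arguments scaleFactor, minNeighbors and minSize; the preset names and values are hypothetical and untuned.

```python
import time

# Hypothetical presets: a finer scale step and a smaller minimum size are
# generally slower but can find more (and smaller) faces.
PRESETS = {
    "thorough": {"scaleFactor": 1.05, "minNeighbors": 3, "minSize": (20, 20)},
    "fast": {"scaleFactor": 1.3, "minNeighbors": 5, "minSize": (60, 60)},
}


def timed_detect(classifier, gray_image, preset_name):
    """Run detectMultiScale with a named preset; return (boxes, seconds)."""
    params = PRESETS[preset_name]
    start = time.perf_counter()
    boxes = classifier.detectMultiScale(gray_image, **params)
    elapsed = time.perf_counter() - start
    return list(boxes), elapsed


# Usage with OpenCV installed (image path is a placeholder):
#   import cv2
#   clf = cv2.CascadeClassifier("lbpcascade_frontalface.xml")
#   gray = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)
#   boxes, seconds = timed_detect(clf, gray, "fast")
```

Running both presets over the same images, together with the hit/miss counting used for evaluation, gives a speed and accuracy pair per setting rather than a single number per cascade.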

For testing the cascade, you should settle on a data set and a method such as k-fold cross-validation.
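A minimal sketch of the k-fold idea, assuming the data set is simply a list of annotated image names: each image is used for testing exactly once and for training in the other folds. The function name is hypothetical, and retraining a cascade once per fold is left out here.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) lists; every item lands in exactly one test fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable splits
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

Note that with this scheme the cascade has to be trained k times, which is far more practical with LBP than with Haar given the training times mentioned above.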


LBP is a few times faster but less accurate (10-20% less than Haar).

If you want to detect faces on an embedded system, LBP is the default choice, because it does its calculations in integers.

Haar uses floats for processing, which have poorer support on embedded and mobile processors; as a result, the performance penalty is significant - big enough to make its usage on mobile phones impractical.
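To illustrate why LBP needs only integer arithmetic, here is a sketch of the basic 3x3 local binary pattern operator: each neighbour is compared with the centre pixel and the results are packed into an 8-bit code. This is a simplified textbook form, not OpenCV's implementation, which as far as I know uses a multi-block variant computed over an integral image; the bit ordering below is an arbitrary choice.

```python
def lbp_code(patch):
    """8-bit LBP code for the centre of a 3x3 patch of integer intensities."""
    center = patch[1][1]
    # Neighbours listed clockwise, starting at the top-left corner.
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2], patch[1][2],
        patch[2][2], patch[2][1], patch[2][0], patch[1][0],
    ]
    code = 0
    for value in neighbours:
        # Only integer comparisons, shifts and ORs are involved.
        code = (code << 1) | (1 if value >= center else 0)
    return code
```

For example, a patch whose neighbours are all darker than the centre yields code 0, and a uniform patch yields 255.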