The complexities of the human face: analyzing facial recognition technologies in unconstrained environments

To make a long story short, multiple variables need to be considered when it comes to FR challenges, but I would narrow the challenges down to four main categories:

  • Person-related, that is, variations in pose, expression, and illumination
  • Device-related, that is, using cameras of different capabilities, visible vs. thermal cameras, or even surveillance vs. cell-phone cameras
  • Challenges related to FR matching algorithms, that is, commercial algorithms (fixed black boxes that we cannot change) vs. academic FR algorithms that do or do not utilize training data
  • Challenges related to other factors, such as image quality (for example, image resolution, compression, blur), time span (facial aging), occlusion, and demographic information (for example, gender, race/ethnicity, or age). For example, FRSs will behave differently when they are trained and tested on a single cohort (such as one race group) than when different cohorts are used.

Oh, I forgot one more thing: maybe the greatest challenge is the combination of the above challenges.

How FR is used by the military and in government at present.

There are multiple examples of FR technology being used by both law enforcement and the military. I will start with the US-VISIT program supported by the Department of Homeland Security. US-VISIT serves DHS’s mission to “protect our nation by providing biometric identification services to federal, state and local government decision makers to help them accurately identify the people they encounter and determine whether those people pose a risk to the United States.” One of the most visible services of US-VISIT is the collection of face photos and fingerprints from international travelers at U.S. visa-issuing posts and ports of entry. The face and fingerprint images collected help immigration officers determine whether a person is eligible to receive a visa or enter the United States. The challenge here is the amount of information that needs to be processed. The advantage, however, is that the biometrics are collected under controlled conditions (point of entry, indoors, cooperative subjects), and thus recognition accuracy can be very high.

According to one of the reports (May 2009) from the Pinellas County Sheriff’s Office in Florida about its FR program, the office is using millions of jail mug shots to double-check identities and verify whether someone is lying about who they are. Deputies can take a photo of an individual and then use their in-car laptop to begin an FR identity search. One of their success stories is that in 2009 they managed to track down and arrest a bank robbery suspect using footage from a surveillance camera fed to their FR system.

There are also multi-biometric devices (even mobile ones) that combine various sensors to collect face, iris and fingerprint data, and these are currently used with great success in Iraq and Afghanistan. As the New York Times reported, face images (and other biometric data) are being taken and registered from “a remarkable number of people in Afghanistan and Iraq, particularly men of fighting age.” One of the success stories came after 475 Afghan prisoners tunneled out of the Sarposa jail in Kandahar. The U.S. military’s biometrics program helped round them back up. According to the New York Times, within days about thirty-five escapees were picked up at checkpoints, border crossings and even a recruiting station for Afghan security forces, thanks to the facial, iris and fingerprint data collected from inmates in Afghan prisons and sent to handheld devices wielded by U.S. and Afghan forces.

Archer: Speak to how shortwave IR enhances military capabilities for human recognition in harsh environmental conditions.

Bourlai: In uncontrolled scenarios (long-range recognition, operational conditions), there is a need for efficient Intelligence, Surveillance and Reconnaissance (ISR) interoperability. That is, operational teams (for example, armed forces) are required to effectively manage, access and use ISR to improve command and control, and to enhance information sharing and situational understanding, so that operations become more effective while collateral damage in a complex environment is minimized.

For that purpose, the use of biometrics technology, including FR, is very important. In terms of FR technology, the military is interested, first, in being able to efficiently obtain the face biometric in various tactical situations through automatic detection of known persons of interest, and second, in using FR technology that is feasible and accurate in scenarios involving opportunistic capture of face images in difficult environments (for example, day or night time).

The problem is that most FRSs depend on the usage of face images captured in the visible range of the electromagnetic spectrum, that is, 380-750 nm. However, in real-world military scenarios they deal with harsh environmental conditions characterized by unfavorable lighting and pronounced shadows. Such an example is a night-time environment, where human recognition based solely on visible spectral images may not be feasible. In order to deal with such difficult FR scenarios, multi-spectral camera sensors are very useful because they can image day and night. Thus, recognition of faces across the infrared spectrum has become an area of growing interest.

The infrared (IR) spectrum is divided into different spectral bands. The boundaries between these bands can vary depending on the scientific field involved (for example, optical radiation, astrophysics, or sensor technology). The SWIR band is part of the reflected IR (active) band and can range from 0.9 to 2.5 μm (in our studies we used a SWIR camera that covers 0.9 to 1.9 μm).
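To make the quoted numbers concrete, here is a minimal sketch that checks which band a wavelength falls into. It uses only the ranges stated in this interview (visible 380-750 nm, SWIR 0.9-2.5 μm, and the 0.9-1.9 μm camera). The names `BANDS_NM`, `band_of` and `camera_covers` are hypothetical, and the other IR bands are left out on purpose because their boundaries differ between fields.

```python
# Band limits in nanometres, copied from the figures quoted in the interview.
# NIR, MWIR and thermal bands are omitted: their boundaries vary by field.
BANDS_NM = {
    "visible": (380.0, 750.0),
    "SWIR": (900.0, 2500.0),
}

# Range of the SWIR camera mentioned in the interview (0.9-1.9 micrometres).
CAMERA_RANGE_NM = (900.0, 1900.0)


def band_of(wavelength_nm):
    """Return the name of the listed band containing the wavelength, or None."""
    for name, (low, high) in BANDS_NM.items():
        if low <= wavelength_nm <= high:
            return name
    return None


def camera_covers(wavelength_nm, camera_range=CAMERA_RANGE_NM):
    """Return True if the wavelength lies inside the camera's sensing range."""
    low, high = camera_range
    return low <= wavelength_nm <= high
```

For example, `band_of(1550)` falls in the SWIR band and is inside the camera range, while a wavelength of 2200 nm is still SWIR but outside what that particular camera can sense.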

But why is SWIR so important for the military? SWIR has a longer wavelength range than NIR and is more tolerant of low levels of obscurants like fog and smoke. Depending on the object being imaged, there are differences in appearance between images sensed in the visible and the active IR band. Although regions in the SWIR band require an external light source, the advantage is that a SWIR imaging system can take advantage of sunlight, moonlight, or starlight, and can remain unobtrusive and covert since the reflected IR light is invisible to the human eye.

The main benefits of using the SWIR spectrum for face recognition are the following:

  • The external source of illumination is invisible to the human eye, making it suitable for covert applications. It can be useful in both day and night-time environments and can provide the capability to image a target at long standoff distances, for example, 1,300 feet away (depending on the sensor and optical system).
  • SWIR imagery can be combined with visible-light imagery to generate a more complete image of the human face, and facial features that are not observed in the visible spectrum may be observable in the SWIR spectrum.
  • Its proximity to the visible spectrum makes it particularly relevant for use in biometric-related face applications.

Archer: Examine the new algorithms for unconstrained recognition. What does this mean?

Bourlai: Here we have to understand what “unconstrained recognition” is, why we need to develop new algorithms, and discuss some example cases.

The performance of most available FRSs is very good when face images are acquired under “favorable situations” or, in other words, under “constrained conditions,” that is, when we are working in a lab environment with cooperative subjects, short standoff distances, good illumination, no facial expression or occlusion, etc.

The question is: what happens in challenging real-world situations? Can we still use existing FR algorithms? Do we need to design them differently? Do we also need to support them with other algorithms (image enhancement, fusion) that aim to improve their baseline capabilities? The answer is yes.

One of the main FR challenges is to be able to efficiently match face images captured under completely different situations. This is considered an “unconstrained recognition” problem, and many government agencies have to deal with it. Although there are many very interesting examples, a typical real-world example of an unconstrained recognition problem is trying to match a face image of an uncooperative individual, captured at night-time using a surveillance camera (usually near-infrared), against a previously acquired database of good-quality visible images, such as mug shots, passport photos or other high-quality ID photos.

Today I will discuss two real-world unconstrained recognition problems for which new algorithms had to be developed. WVU’s Multispectral Imagery Lab, which I direct, is working on these problems in collaboration with other researchers within or outside WVU.

The first one is the restoration of severely degraded face images before they are matched to high-quality visible images. More specifically, this is an automated face recognition scenario that involves comparing degraded facial photographs of subjects (passport photos, or photos degraded as a consequence of scanning, printing, or faxing) against their high-resolution counterparts. These scenarios are encountered in situations where there is a need, for example, to identify legacy face photos that were acquired by one government agency and faxed to another. Other examples include matching scanned face images present in passports (with watermarks), driver’s licenses, refugee documents, and visas for the purpose of establishing or verifying a subject’s identity.

To deal with such a difficult problem, a preprocessing scheme (with low computational complexity) was developed in the lab in order to eliminate the noise present in degraded face images and restore their quality. For example, imagine a faxed passport photo that has watermarks. Our study established that the proposed restoration scheme improves the quality of the ensuing face images, while simultaneously improving the performance of face matching using various commercial and academic FR algorithms. The algorithmic package we developed can be a very useful tool at U.S. border control, ports of entry and customs.

The second very interesting example is the identification of people in heterogeneous environments, in particular in situations where the face images being matched come from sensors that operate in completely different bands (visible vs. NIR, SWIR or thermal). This is commonly called a “cross-spectral matching” scenario, and the complexity of the problem can range from moderate (face images of cooperative subjects are captured at different spectra under controlled indoor or semi-controlled conditions) to very severe (face images are captured outdoors at mid- or long-range standoff distances, with non-cooperative subjects, expression, occlusion, night-time, etc.).

Even though there are so many commercial and academic FR algorithms available, their performance in such cases is not expected to be good enough. This is mainly because these algorithms were not originally designed to deal with such adverse and challenging scenarios, where the nature (for example, the facial characteristics) of the face images that need to be matched varies considerably.

Our Multispectral Imagery Lab is working toward solving such challenging problems by developing new algorithms. We have recently published a study on the advantages and limitations of matching (i) short-wave infrared (SWIR) face images to visible images under controlled or uncontrolled conditions; (ii) mid-wave infrared (MWIR) to MWIR or visible images under controlled conditions; and (iii) intra-distance near-infrared (NIR) to NIR images and cross-distance, cross-spectral NIR to visible images.

These problems are considered to be very important for military and law enforcement agencies.

Archer: In your opinion, how will facial recognition (FR) be used within the military and commercially in the short and long term future?

Bourlai: Let me start by saying that media giants such as Facebook, Google and Apple now include FR in their products (NIST 2011 report on Biometric Challenges), and the commercial development of low-cost devices, that is, cameras with built-in face detection, is underway. In the military (but also in other government agencies like law enforcement), the purpose of using FR is to enhance security in areas where it is important to verify the identity of individuals using facial images. However, the military knows that FR alone is not good enough and that it works better in combination with other security-related technologies. A current trend is to develop synergistic technologies to identify potentially hostile behavior and intent, in order to uncover clandestine foes.

Thus, I will give you some areas where FR is or will be used, either as a stand-alone technology or in combination with others:

Border control: There have always been pictures in passports. And for most of the world, there have been pictures in visas. There is a U.S. FR program, which is currently used only for visa applications. In this case, computer programs check applicants’ digital photographs against a database of some seventy-eight million photos to search for matches. The same photo under different names may indicate fraud. You can understand that the face is part of our digital ID and that better FR capabilities are needed (accuracy and speed are both very important). Also, the need for surveillance tools to protect borders and high-risk perimeters will increase the use of long-range FR systems (these are areas we are working on in the lab).

Physical access control in military facilities: technology will go a step further, to where a threat can be detected or predicted before someone reaches the gate of a military facility. I am not talking only about short-range FR, but about FR in combination with, for example, long-range iris recognition more than 100 feet away, or FR more than 1,000 feet away. This is not too far away in the future! Do you remember “Minority Report”?

It is also worth mentioning that a promising avenue for extensions, even more applicable in the case of longer-range FR, has to do with incorporating context into the recognition process, that is, utilizing image content outside the detected face (objects, other faces, environment, recognized text) or other sources of context, such as spatio-temporal information associated with the picture. Close to that concept, the Army is currently interested in an effort called “Tagging, Tracking, and Locating,” or TTL. The strategy in places like Afghanistan is to develop capabilities to take out individual insurgents. Again, FR will play its role there, but always as part of an integrated system.

Another interesting and more general example of where things will move in the future is the evolution of FR technology in combination with other technologies such as cloud computing and social networking. A recent study at CMU warns of a potential danger. It claims that when we use Facebook and share tagged photos of ourselves online, other Facebook users may be able to link our faces to our names without our consent, that is, in situations where anonymity is normally expected. Thus, even though I have been working in the area and contributing to FR technologies for some time now, I am also concerned about what may go wrong in our everyday lives in the future. Privacy is the issue here, and we need to protect ourselves from integrated technologies that anyone can use at any time to do harm (such as identity theft).

Of course, all future technologies depend on the immediate or long-term needs of government agencies. Remember what happened recently with the new underwear bomb plot? In a recent CBC news article, Hillary Rodham Clinton, U.S. secretary of state, stated that “The device did not appear to pose a threat to the public air service, but the plot itself indicates that these terrorists keep trying to devise more and more perverse and terrible ways to kill innocent people.” Therefore, new security measures need to be put in place, and biometrics will always play an important role in enhancing airport security procedures.

Archer: Fingerprints v iris v facial recognition? What’s the ideal marker? Is it important to capture all of these unique identifiers?

Bourlai: No biometric is ideal. We know that there are advantages and disadvantages to using each biometric trait, and different biometric characteristics are acceptable in different applications. In practice, the nature and requirements of the application, together with the properties of the biometric characteristic, establish the relevance of a specific biometric to that application.

There are several factors that determine how suitable a biometric trait is for a particular biometric application. One would expect the choice of a biometric trait to depend primarily on matching performance (the recognition accuracy and the resources required to achieve that accuracy should meet the constraints imposed by the application). Other factors are the maturity of the biometric identifier; permanence (the biometric trait of an individual should be sufficiently invariant over a period of time with respect to the matching algorithm, since a trait that changes significantly over time is not a useful biometric); uniqueness (the given trait should be sufficiently different across the individuals comprising the population); and universality (every individual accessing the application should possess the trait). Because it is possible for a subset of users not to possess a particular biometric trait, multimodal systems are necessary.

Face, iris and fingerprint are very important biometrics, but we have to consider a few things when comparing them.

We can say that FR is one of the most friendly and non-invasive ways of recognizing a person. On the other hand, iris and fingerprint recognition are among the most accurate biometrics. The accuracy of FR is very sensitive to illumination, pose and expression, but faces can be easily captured at variable standoff distances, and although subject cooperation is important, it is not as necessary as in other biometrics. In the case of irises and fingerprints, images must meet stringent quality criteria; if they do not, they are automatically rejected at acquisition time.

Another issue is that the face, more than the other two biometrics, is a very dynamic biometric that changes over time (hair, aging). For the iris, there are no current studies that statistically prove irises are stable for a lifetime. I have not talked about fingerprints yet, and it is important to say that the FBI has been keeping fingerprint databases for more than 100 years and that the technology is very mature and accurate, especially when working in favorable situations.

In terms of missing a certain biometric, there are cases where someone is missing an eye or some fingers, and that can be a problem. The face can play a more important role in such cases. In some cultures there is substantial resistance to fingerprint acquisition (this factor is also called “acceptability”). So biometrics that can be acquired without contact are more attractive. Face and iris are in general better in this respect, although there are cultures in which a person’s full face is not exposed for capture. Also, interoperability plays a role here.

What is the bottom line? Face, iris and fingerprint each provide some unique advantages, and thus it is very important to use multiple technologies, capture all of these unique identifiers (with each biometric sample making up for the shortcomings of the other two) and combine their capabilities (for example, by looking at which one provides the highest rank) to increase the chances of identifying an individual.
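As a rough illustration of combining the three biometrics by "which one provides the highest rank," the sketch below implements a simple highest-rank fusion: each matcher returns its candidates ordered best-first, and every identity keeps the best position any matcher gave it. This is not the code of any fielded system. The function name, the treatment of identities missing from a list, and the tie-break by rank sum are all assumptions made for the example.

```python
def highest_rank_fusion(rankings):
    """Fuse per-modality candidate lists with the highest-rank rule.

    rankings: dict mapping a modality name (e.g. "face") to a list of
    identities ordered from best match to worst.
    Returns all identities, ordered by their best rank across modalities.
    """
    identities = sorted({identity for ranking in rankings.values()
                         for identity in ranking})
    fused = []
    for identity in identities:
        ranks = []
        for ranking in rankings.values():
            if identity in ranking:
                ranks.append(ranking.index(identity) + 1)  # 1 = best
            else:
                # Assumption: an identity a matcher did not return is
                # ranked just below that matcher's last candidate.
                ranks.append(len(ranking) + 1)
        # Assumption: ties on best rank are broken by the sum of ranks,
        # then alphabetically so the result is deterministic.
        fused.append((min(ranks), sum(ranks), identity))
    fused.sort()
    return [identity for _, _, identity in fused]


example = {
    "face": ["B", "A", "C"],
    "iris": ["A", "C", "B"],
    "fingerprint": ["A", "B", "C"],
}
print(highest_rank_fusion(example))
```

In this toy example the face matcher prefers subject B, but iris and fingerprint both put A first, so A ends up at the top of the fused list. This is the sense in which each biometric can make up for the shortcomings of the others.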

Government agencies already understand the importance of using all three biometrics, and you can now find both commercial and military systems that are mobile and can capture all of them in a single device.

For example, according to a senior Defense Department official, there is a device called the Secure Electronic Enrollment Kit, or SEEK, which is a handheld biometrics recorder that takes iris scans, fingerprints and facial scans and ports them back to an FBI database in West Virginia in seconds (actually, we have such a device here at WVU).

It is also very promising that it takes just a few seconds to compare your fingerprints alone against the two million that are in a database of terrorists, sex offenders, criminals with outstanding arrest warrants and others. This is according to NLETS press in 2011, which says that this is what happens “If you get stopped by the police in Houston” and they get your fingerprints. This is “part of the FBI’s new nationwide Next-Generation Identification system that eventually will employ a host of new technologies to more quickly and accurately identify criminal suspects.” In 2013, the same system, NGI, is scheduled to add FR (starting with mug shots and driver’s license photos), along with capabilities for automated comparison of scars and body markings. This is because in the past two to three years FR performance when matching such images has improved substantially. Also in 2013, the FBI is expected to begin experimenting with iris recognition.

Note: Bourlai wishes to acknowledge the contribution of Jain, Ross, Hornak, Cukic, Kalka, and other WVU students. He will present at IDGA’s Biometrics & Identity Management Summit, 20-22 August 2012, in Arlington, Virginia.

Chris Archer, the online content editor at IDGA (the Institute for Defense & Government Advancement)