Despite significant advances in face recognition technology, it has yet to achieve the levels of accuracy required for many commercial and industrial applications. The high error rates stem from well-known sub-problems: variation in lighting, facial expression and orientation all significantly increase error rates. In an attempt to address these issues, research has begun to focus on the use of three-dimensional face models, motivated by three main factors. Firstly, by relying on geometric shape rather than colour and texture information, systems become invariant to lighting conditions. Secondly, the ability to rotate a facial structure in three-dimensional space, compensating for variations in pose, aids those methods requiring alignment prior to recognition. Thirdly, the additional depth information in the facial surface structure, not available from two-dimensional images, provides supplementary cues for recognition.
In this paper we expand on previous research involving the use of facial surface data, derived from 3D face models (generated using a stereo vision 3D camera), as a substitute for the more familiar two-dimensional images. A number of investigations have shown that three-dimensional structure can be used to aid recognition. Zhao and Chellappa use a generic 3D face model to normalise facial orientation and lighting direction in two-dimensional images, increasing recognition accuracy from approximately 81% (correct match within a rank of 25) to 100%. Similar results are seen in the Face Recognition Vendor Test, showing that pose correction using Romdhani et al.'s technique reduces error rates when applied to the FERET database.
Blanz et al. [5] take a comparable approach, using a morphable face model to aid identification of 2D images. Beginning with an initial estimate of lighting direction and face shape, Romdhani et al. iteratively alter the shape and texture parameters of the morphable face model, minimising the difference to the two-dimensional image. These parameters are then taken as features for identification, resulting in 82.6% correct identifications on a test set of 68 people.
Although these methods show that knowledge of three-dimensional face shape can aid normalisation for two-dimensional face recognition systems, none of them uses actual three-dimensional geometric structure to perform recognition. Beumier and Acheroy [6, 7], by contrast, make direct use of such information, testing various methods of matching 3D face models, although few proved successful.
Curvature analysis proved ineffective, and feature extraction was not robust enough to provide accurate recognition. However, Beumier and Acheroy were able to achieve reasonable error rates using curvature values of vertical surface profiles: verification tests carried out on a database of 30 people produced equal error rates (EER) between 7.25% and 9.0%. Hesher et al. test a different method, applying principal component analysis (PCA) to depth maps and using Euclidean distance to perform identification, achieving 94% accuracy on 37 face models (when trained on the gallery set). Further investigation into this approach is carried out by Heseltine et al., showing how different surface representations and distance measures affect recognition, reducing the EER from 19.1% to 12.7% on a difficult test set of 290 face models. However, the focus of this research has been on identifying optimum surface representations, with little regard for the advantages offered by each individual representation.
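The depth-map PCA approach described above can be sketched as follows. This is an illustrative example on synthetic data, not the implementation used in the studies cited; the function names and parameter choices (image size, number of components) are our own assumptions.

```python
import numpy as np

def pca_subspace(gallery, n_components):
    """Build a PCA subspace from flattened depth maps (one row per image)."""
    mean = gallery.mean(axis=0)
    centred = gallery - mean
    # SVD yields the principal axes without forming the covariance matrix
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:n_components]

def identify(probe, gallery, mean, axes):
    """Return the index of the gallery depth map nearest in Euclidean distance
    within the PCA subspace."""
    g_proj = (gallery - mean) @ axes.T   # project gallery into the subspace
    p_proj = (probe - mean) @ axes.T     # project the probe the same way
    dists = np.linalg.norm(g_proj - p_proj, axis=1)
    return int(np.argmin(dists))

# Synthetic example: 37 depth maps of 32x32 pixels, flattened to vectors
rng = np.random.default_rng(0)
gallery = rng.normal(size=(37, 32 * 32))
mean, axes = pca_subspace(gallery, n_components=20)
probe = gallery[5] + rng.normal(scale=0.01, size=32 * 32)  # noisy copy of subject 5
print(identify(probe, gallery, mean, axes))  # → 5
```

Because the system is trained on the gallery set itself, identification here amounts to nearest-neighbour matching of subspace projections, which is why small perturbations of a gallery depth map are still matched correctly.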
We suggest that different surface representations may be particularly suited to certain capture conditions or facial characteristics, even if generally weaker for overall recognition. For example, curvature representations may aid recognition by making the system more robust to inaccuracies in 3D orientation, yet also be highly sensitive to noise. Another representation may enhance nose shape, but lose information regarding jaw structure.
In this paper we analyse and evaluate a variety of three-dimensional fishersurface face recognition systems, each incorporating a different surface representation of facial structure. We propose a means of identifying and extracting components from the surface subspace produced by each system, such that they may be combined into a single unified subspace. Pentland et al. [10] have previously examined the benefit of using multiple eigenspaces, constructing specialist subspaces for various facial orientations, from which cumulative match scores reduced error rates. Our approach differs in that we extract and combine individual dimensions, creating a single unified surface space, as applied to two-dimensional images in previous investigations.
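The idea of extracting individual dimensions from several subspaces and combining them can be illustrated with a minimal sketch. We assume, purely for illustration, that dimensions are ranked by a per-dimension Fisher criterion (between-class over within-class variance) computed on projected training data; the actual selection criterion, function names, and data are not taken from the studies above.

```python
import numpy as np

def fisher_score(feature, labels):
    """Between-class over within-class variance of a single 1-D feature."""
    classes = np.unique(labels)
    grand = feature.mean()
    between = sum((feature[labels == c].mean() - grand) ** 2 for c in classes)
    within = sum(feature[labels == c].var() for c in classes) + 1e-12
    return between / within

def unified_projection(projections, labels, keep):
    """From each subspace's projected training data, keep the `keep` most
    discriminative dimensions and concatenate them into one unified space."""
    selected = []
    for proj in projections:  # proj: (n_samples, n_dims) for one representation
        scores = [fisher_score(proj[:, j], labels) for j in range(proj.shape[1])]
        best = np.argsort(scores)[::-1][:keep]  # highest-scoring dimensions first
        selected.append(proj[:, best])
    return np.hstack(selected)

# Synthetic example: two surface representations, 3 subjects with 4 samples each
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 4)
rep_a = rng.normal(size=(12, 5)) + labels[:, None]  # class-discriminative dimensions
rep_b = rng.normal(size=(12, 5))                    # mostly noise
unified = unified_projection([rep_a, rep_b], labels, keep=2)
print(unified.shape)  # (12, 4): 2 dimensions kept from each of the 2 subspaces
```

The point of the sketch is the contrast with cumulative match scores: rather than fusing the outputs of separate systems, the selected dimensions themselves are concatenated, so distance computations take place in one unified surface space.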