Posted On May 17, 2019
INTRODUCTION:
Facial expression, as a powerful nonverbal channel, plays an important role in how human beings convey emotions and transmit messages. Automatic facial expression recognition (AFEC) can be widely applied in many fields, such as medical assessment, lie detection and human-computer interaction, and it has attracted great interest in the past two decades. However, facial expression analysis is a very challenging task because facial expressions caused by facial muscle movements are subtle and transient. Capturing and representing these movements is a key issue in facial expression analysis.

Two main streams of facial expression analysis are widely adopted in current research and development. One stream detects facial actions: each facial expression contains a unique group of facial action units, and the Facial Action Coding System (FACS) is the best-known system developed for human beings to describe facial actions. The other stream carries out facial affect (emotion) recognition directly. Most researchers deal with the recognition of six universal emotions: happy, sad, fear, disgust, angry and surprise.

Many efforts have been made for facial expression recognition. The methodologies used are commonly categorized by the kind of feature they extract. A geometry-based method captures facial configurations, in which a set of facial fiducial points is used to characterize the face shape.

(Junkai Chen and Zheru Chi are with the Department of Electronic and Information Engineering.)

SCOPE OF THE PROJECT:
Facial expression, as a powerful nonverbal channel, plays an important role in how human beings convey emotions and transmit messages. Automatic facial expression recognition (AFEC) can be widely applied in many fields, such as medical assessment, lie detection and human-computer interaction.

LITERATURE SURVEY:
1. “The INTERSPEECH 2010 Paralinguistic Challenge”, by Bjorn Schuller, Stefan Steidl, and Anton Batliner.
Most paralinguistic analysis tasks are lacking agreed-upon evaluation procedures and comparability, in contrast to more ‘traditional’ disciplines in speech analysis. The INTERSPEECH 2010 Paralinguistic Challenge shall help overcome the usually low compatibility of results by addressing three selected sub-challenges. In the Age Sub-Challenge, the age of speakers has to be determined in four groups. In the Gender Sub-Challenge, a three-class classification task has to be solved, and finally, the Affect Sub-Challenge asks for speakers’ interest in ordinal representation. This paper introduces the conditions, the Challenge corpora “aGender” and “TUM AVIC”, and standard feature sets that may be used. Further, baseline results are given.

2. “Local Gabor Binary Patterns from Three Orthogonal Planes for Automatic Facial Expression Recognition”, by T. R. Almaev and Michel F. Valstar.
Facial actions cause local appearance changes over time, and thus dynamic texture descriptors should inherently be more suitable for facial action detection than their static variants. In this paper we propose the novel dynamic appearance descriptor Local Gabor Binary Patterns from Three Orthogonal Planes (LGBP-TOP), combining the previous success of LGBP-based expression recognition with TOP extensions of other descriptors. LGBP-TOP combines spatial and dynamic texture analysis with Gabor filtering to achieve unprecedented levels of recognition accuracy in real time. While TOP features risk being sensitive to misalignment of consecutive face images, a rigorous analysis of the descriptor shows the relative robustness of LGBP-TOP to face registration errors caused by errors in rotational alignment.
Experiments on the MMI Facial Expression and Cohn-Kanade databases show that, for the problem of FACS Action Unit detection, LGBP-TOP outperforms both its static variant LGBP and the related dynamic appearance descriptor LBP-TOP.

3. “Expression recognition in videos using a weighted component-based feature descriptor”, by Xiaohua Huang, Guoying Zhao, Matti Pietikainen, and Wenming Zheng.
In this paper, we propose a weighted component-based feature descriptor for expression recognition in video sequences. Firstly, we extract the texture features and structural shape features in three facial regions of each face image: mouth, cheeks and eyes. Then, we combine these extracted feature sets using a confidence level strategy. Noting that different facial components contribute differently to expression recognition, we propose a method for automatically learning different weights for the components via multiple kernel learning. Experimental results on the Extended Cohn-Kanade database show that our approach, combining the component-based spatiotemporal feature descriptor and the weight learning strategy, achieves better recognition performance than the state-of-the-art methods.

4. “Deep Learning for Robust Feature Generation in Audiovisual Emotion Recognition”, by Yelin Kim, Honglak Lee, and Emily Mower Provost.
Automatic emotion recognition systems predict high-level affective content from low-level human-centered signal cues. These systems have seen great improvements in classification accuracy, due in part to advances in feature selection methods. However, many of these feature selection methods capture only linear relationships between features or alternatively require the use of labeled data. In this paper we focus on deep learning techniques, which can overcome these limitations by explicitly capturing complex non-linear feature interactions in multimodal data.
We propose and evaluate a suite of Deep Belief Network models, and demonstrate that these models show improvement in emotion classification performance over baselines that do not employ deep learning. This suggests that the learned high-order non-linear relationships are effective for emotion recognition.

5. “Facial expression recognition based on Local Binary Patterns: A comprehensive study”, by Caifeng Shan, Shaogang Gong, and Peter W. McOwan.
Automatic facial expression analysis is an interesting and challenging problem, and impacts important applications in many areas such as human–computer interaction and data-driven animation. Deriving an effective facial representation from original face images is a vital step for successful facial expression recognition. In this paper, we empirically evaluate facial representation based on statistical local features, Local Binary Patterns, for person-independent facial expression recognition. Different machine learning methods are systematically examined on several databases. Extensive experiments illustrate that LBP features are effective and efficient for facial expression recognition. We further formulate Boosted-LBP to extract the most discriminant LBP features, and the best recognition performance is obtained by using Support Vector Machine classifiers with Boosted-LBP features. Moreover, we investigate LBP features for low-resolution facial expression recognition, which is a critical problem but seldom addressed in the existing work. We observe in our experiments that LBP features perform stably and robustly over a useful range of low resolutions of face images, and yield promising performance in compressed low-resolution video sequences captured in real-world environments.

FUNCTIONAL REQUIREMENTS
A functional requirement defines a function of a software system or its component. A function is described as a set of inputs, the behavior, and outputs.
Our system requires a minimum of three systems to achieve this concept.

NON-FUNCTIONAL REQUIREMENTS
EFFICIENCY
Our application efficiently characterizes the server and the cluster requests and responses.

MODULES:
1. Frame Conversion
2. Feature Extraction
3. Feature Pooling
4. Classification

BLOCK DIAGRAM:

MODULE DESCRIPTION:

MODULE 1:
1. Frame Conversion:
The video input is converted to a frame sequence.

MODULE 2:
2. Feature Extraction:
A. Histograms of oriented gradients:
Histograms of oriented gradients (HOG) were first proposed for human detection. The basic idea of HOG is that local object appearance and shape can often be characterized rather well by the distribution of local intensity gradients or edge directions. HOG is sensitive to object deformations. Facial expressions are caused by facial muscle movements. For example, an open mouth and raised eyebrows generate a surprised facial expression. These movements can be regarded as types of deformations, and HOG can effectively capture and represent them [39]. However, the original HOG is limited to static images. In order to model dynamic textures from a video sequence with HOG, we extend HOG to 3-D and compute the oriented gradients on three orthogonal planes XY, XT, and YT (TOP), i.e. HOG-TOP. The proposed HOG-TOP is used to characterize facial appearance changes.

B. Geometric feature:
In this section, we introduce a more robust geometric feature, namely the geometric warp feature, which is derived from the warp transform of the facial landmarks. Facial expressions are caused by facial muscle movements. These movements result in displacements of the facial landmarks. Here we assume that each face image consists of many sub-regions. These sub-regions can be formed with triangles whose vertices are located at facial landmarks, as shown in Fig. 5. The displacements of facial landmarks cause deformations of the triangles. We propose to use these deformations to represent facial configuration changes.

C. Acoustic feature:
Visual modalities (face images) and audio modalities (speech) can both convey the emotions and intentions of human beings. Audio modalities also provide some useful clues for affect recognition in video. For instance, the method in [42] works with the voice signal and proposes an enhanced autocorrelation (EAC) feature for emotion recognition in video.

MODULE 3:
3. Feature Pooling:
Features from different modalities can make different contributions. A traditional SVM approach concatenates the different features into a single feature vector and builds a single kernel for all of them. However, constructing a kernel for each type of feature and integrating these kernels optimally can enhance the discriminative power of the features.

MODULE 4:
4. Classification:
To construct an optimal hyperplane, SVM employs an iterative training algorithm, which is used to minimize an error function. According to the form of the error function, SVM models can be classified into four distinct groups, one of which is Classification SVM Type 1 (also known as C-SVM classification).

PROPOSED SYSTEM TECHNIQUE EXPLANATION
Feature extraction plays a central role in affect recognition in video, so designing an effective feature is important and meaningful. LBP-TOP is widely used for modeling dynamic textures. However, LBP-TOP has two limitations. One is its high dimensionality: the size of LBP-TOP coded using a uniform pattern is 59 × 3 [10]. Moreover, although LBP-TOP is robust to illumination changes, it is insensitive to facial muscle deformations. In this work, we propose a new feature called HOG-TOP, which is more compact and more effective at characterizing facial appearance changes. More details on HOG-TOP can be found in Section 3.1. In addition, configuration and shape representations play an important role in human vision for the perception of facial expressions [37].
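As a rough illustration of the HOG-TOP idea described under Feature Extraction, the following Python sketch computes one orientation histogram per plane (XY, XT, YT) over a whole video volume and concatenates them. It is a minimal sketch under stated assumptions, not the implementation used in this project (which targets MATLAB): the function name `hog_top`, the 9-bin unsigned orientation range, the central-difference gradients, and the use of a single global cell instead of blocks are all illustrative choices.

```python
import math


def hog_top(volume, n_bins=9):
    """Concatenate orientation histograms from the XY, XT and YT planes.

    volume[t][y][x] holds grayscale intensities. Only interior voxels are
    used so that central differences exist along x, y and t.
    Assumption: one global histogram per plane (no cells or blocks).
    """
    n_t, n_y, n_x = len(volume), len(volume[0]), len(volume[0][0])
    hists = {"XY": [0.0] * n_bins, "XT": [0.0] * n_bins, "YT": [0.0] * n_bins}

    for t in range(1, n_t - 1):
        for y in range(1, n_y - 1):
            for x in range(1, n_x - 1):
                # Central-difference gradients along x, y and time.
                gx = (volume[t][y][x + 1] - volume[t][y][x - 1]) / 2.0
                gy = (volume[t][y + 1][x] - volume[t][y - 1][x]) / 2.0
                gt = (volume[t + 1][y][x] - volume[t - 1][y][x]) / 2.0
                pairs = (("XY", (gx, gy)), ("XT", (gx, gt)), ("YT", (gy, gt)))
                for plane, (a, b) in pairs:
                    magnitude = math.hypot(a, b)
                    if magnitude == 0.0:
                        continue
                    # Unsigned orientation folded into [0, pi).
                    angle = math.atan2(b, a) % math.pi
                    index = min(int(angle / math.pi * n_bins), n_bins - 1)
                    # Each voxel votes with its gradient magnitude.
                    hists[plane][index] += magnitude

    descriptor = []
    for plane in ("XY", "XT", "YT"):
        total = sum(hists[plane])
        # L1-normalise each plane's histogram before concatenation.
        descriptor.extend(v / total if total > 0 else 0.0 for v in hists[plane])
    return descriptor


# Toy 4-frame, 4x4 "video": a horizontal ramp that brightens over time.
toy_volume = [[[float(x + t) for x in range(4)] for y in range(4)] for t in range(4)]
descriptor = hog_top(toy_volume)
print(len(descriptor))  # 3 planes x 9 bins = 27
```

With 9 bins the descriptor has 3 × 9 = 27 values for a single cell, which gives a feel for why a HOG-TOP style descriptor can be more compact than the 59 × 3 uniform-pattern LBP-TOP mentioned above. A practical version would divide each plane into blocks and normalise per block.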
We believe that previous works have not yet fully exploited the potential of configuration representations. Characterizing face shape [11], [12] or measuring displacements of fiducial points [14], [38] alone is not sufficient to capture facial configuration changes, especially subtle non-rigid changes. In this work, we introduce a more robust geometric feature to capture facial configuration changes.

SOFTWARE REQUIREMENT:
• MATLAB 7.14, Version R2012

MATLAB
The MATLAB high-performance language for technical computing integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include:
• Data exploration, acquisition, analysis and visualization
• Engineering drawing and scientific graphics
• Algorithm design, analysis and development
• Mathematical and computational functions
• Simulation, prototyping and modeling
• Application development, including building graphical user interfaces
Using MATLAB, you can solve technical computing problems faster than with traditional programming languages, such as C, C++, and FORTRAN.

ADVANTAGE:
• Experiments conducted on the dataset demonstrate that our approach can achieve a promising performance in facial expression recognition in video.

APPLICATION:
• Automatic facial expression recognition (AFEC) can be widely applied in many fields, such as medical assessment, lie detection and human-computer interaction.

CONCLUSION:
Video-based facial expression recognition is a challenging and long-standing problem. In this paper, we exploit the potential of audiovisual modalities and propose an effective framework with multiple feature fusion to handle this problem. Both the visual modalities (face images) and audio modalities (speech) are utilized in our study.
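The multiple feature fusion referred to here, in which one kernel is built per feature type and the kernels are then combined (see Feature Pooling), can be sketched roughly as follows. This is an illustrative Python sketch only: the Gaussian (RBF) kernel, the feature names, the toy vectors, and above all the hand-fixed weights are assumptions. In multiple kernel learning the weights would be learned from training data rather than set by hand.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-gamma * squared_distance)


def combined_kernel(sample_a, sample_b, weights, gammas):
    """Weighted sum of one RBF kernel per feature type.

    sample_a and sample_b map a feature-type name to its vector.
    Assumption: weights are non-negative and sum to 1, and are fixed by
    hand here instead of being learned.
    """
    return sum(
        weights[name] * rbf_kernel(sample_a[name], sample_b[name], gammas[name])
        for name in weights
    )


# Hypothetical toy clips, each described by three feature types.
clip_1 = {"hog_top": [0.2, 0.5, 0.3], "geometric": [1.0, 0.0], "acoustic": [0.4]}
clip_2 = {"hog_top": [0.1, 0.6, 0.3], "geometric": [0.8, 0.1], "acoustic": [0.9]}
weights = {"hog_top": 0.5, "geometric": 0.3, "acoustic": 0.2}
gammas = {"hog_top": 1.0, "geometric": 1.0, "acoustic": 1.0}

print(combined_kernel(clip_1, clip_1, weights, gammas))  # similarity of a clip with itself
print(combined_kernel(clip_1, clip_2, weights, gammas))  # similarity of two different clips
```

Because each feature type keeps its own kernel, feature vectors of different lengths and scales never have to be concatenated, and the weights make each modality's contribution explicit. The resulting kernel values could be passed to any SVM implementation that accepts a precomputed kernel matrix.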
A new feature descriptor called Histogram of Oriented Gradients from Three Orthogonal Planes (HOG-TOP) is proposed to extract dynamic textures from video sequences and characterize facial appearance changes. Experiments conducted on three public databases (CK+, GEMEP-FERA 2011, AFEW 4.0) have shown that HOG-TOP performs as well as the widely used LBP-TOP feature in representing dynamic textures from video sequences. Moreover, HOG-TOP is more effective at capturing subtle facial appearance changes and more robust in dealing with facial expression recognition in the wild. In addition, HOG-TOP is more compact. In order to capture facial configuration changes, we introduce an effective geometric feature derived from the warp transform of the facial landmarks. Realizing that voice is another powerful way for human beings to transmit messages, we also explore the role of speech and employ the acoustic feature for affect recognition in video. We applied the multiple feature fusion to facial expression recognition under a lab-controlled environment and in the wild. Experiments conducted on two facial expression datasets, CK+ and AFEW 4.0, demonstrate that our approach can achieve a promising performance in facial expression recognition in video.

FUTURE ENHANCEMENT:
We applied the multiple feature fusion to facial expression recognition under a lab-controlled environment and in the wild. Experiments conducted on two facial expression datasets, CK+ and AFEW 4.0, demonstrate that our approach can achieve a promising performance in facial expression recognition in video.

REFERENCES:
[1] R. A. Calvo and S. D’Mello, “Affect Detection: An Interdisciplinary Review of Models, Methods, and Their Applications,” IEEE Transactions on Affective Computing, vol. 1, pp. 18-37, 2010.
[2] Y. L. Tian, T. Kanade, and J. F. Cohn, “Recognizing action units for facial expression analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, pp. 97-115, 2001.
[3] K. Scherer and P. Ekman, “Handbook of Methods in Nonverbal Behavior Research,” UK: Cambridge Univ. Press, 1982.
[4] J. F. Cohn and P. Ekman, “Measuring facial action,” 2005.
[5] P. Ekman and W. V. Friesen, “Facial Action Coding System: A Technique for the Measurement of Facial Movement,” Consulting Psychologists Press, 1978.
[6] P. Ekman, W. V. Friesen, and J. C. Hager, “Facial Action Coding System: The Manual on CD ROM. A Human Face,” 2002.
[7] P. Ekman, “An argument for basic emotions,” Cognition & Emotion, vol. 6, pp. 169-200, 1992.
[8] S. Z. Li and A. K. Jain, “Handbook of face recognition,” Springer, 2011.
[9] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 886-893.
[10] G. Zhao and M. Pietikainen, “Dynamic texture recognition using local binary patterns with an application to facial expressions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, pp. 915-928, 2007.
[11] P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews, “The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression,” IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2010, pp. 94-101.
[12] S. W. Chew, P. Lucey, S. Lucey, J. Saragih, J. F. Cohn, and S. Sridharan, “Person-independent facial expression detection using constrained local models,” IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, 2011, pp. 915-920.
[13] S. Taheri, P. Turaga, and R. Chellappa, “Towards view-invariant expression analysis using analytic shape manifolds,” IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, 2011, pp. 306-313.
[14] A. Saeed, A. Al Hamadi, R. Niese, and M. Elzobi, “Effective geometric features for human emotion recognition,” IEEE 11th International Conference on Signal Processing (ICSP), 2012, pp. 623-627.
[15] K. Sikka, T. Wu, J. Susskind, and M. Bartlett, “Exploring bag of words architectures in the facial expression domain,” in Computer Vision-ECCV Workshops and Demonstrations, 2012, pp. 250-259.
[16] Y. Rahulamathavan, R. C. W. Phan, J. A. Chambers, and D. J. Parish, “Facial Expression Recognition in the Encrypted Domain Based on Local Fisher Discriminant Analysis,” IEEE Transactions on Affective Computing, vol. 4, pp. 83-92, 2013.
[17] L. Zhang and D. Tjondronegoro, “Facial expression recognition using facial movement features,” IEEE Transactions on Affective Computing, vol. 2, pp. 219-229, 2011.
[18] S. Happy and A. Routray, “Automatic facial expression recognition using features of salient facial patches,” IEEE Transactions on Affective Computing, vol. 6, pp. 1-12, 2015.
[19] M. F. Valstar, B. Jiang, M. Mehu, M. Pantic, and K. Scherer, “The first facial expression recognition and analysis challenge,” IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, 2011, pp. 921-926.
[20] A. Dhall, A. Asthana, R. Goecke, and T. Gedeon, “Emotion recognition using PHOG and LPQ features,” IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, 2011, pp. 878-883.
[21] X. Huang, G. Zhao, M. Pietikainen, and W. Zheng, “Expression Recognition in Videos Using a Weighted Component-Based Feature Descriptor,” in Proceedings of the 17th Scandinavian Conference on Image Analysis, 2011, pp. 569-578.
[22] T. R. Almaev and M. F. Valstar, “Local Gabor Binary Patterns from Three Orthogonal Planes for Automatic Facial Expression Recognition,” in Affective Computing and Intelligent Interaction (ACII), 2013, pp. 356-361.
[23] X. Huang, Q. He, X. Hong, G. Zhao, and M. Pietikainen, “Improved Spatiotemporal Local Monogenic Binary Pattern for Emotion Recognition in The Wild,” in ACM International Conference on Multimodal Interaction, 2014, pp. 514-520.
[24] X. Huang, G. Zhao, W. Zheng, and M. Pietikainen, “Spatiotemporal Local Monogenic Binary Patterns for Facial Expression Recognition,” IEEE Signal Processing Letters, vol. 19, pp. 243-246, 2012.
[25] F. Long, T. Wu, J. R. Movellan, M. S. Bartlett, and G. Littlewort, “Learning spatiotemporal features by using independent component analysis with application to facial expression recognition,” Neurocomputing, vol. 93, pp. 126-132, 2012.