Learning Discriminative Features for Facial Expression Recognition

Wednesday, August 28, 2019 - 9:30am to 10:30am
Seminar Room 2277, Innovation Center

DISSERTATION DEFENSE
Department of Computer Science and Engineering
University of South Carolina

Author : Jie Cai
Advisor : Dr. Yan Tong
Date : Aug 28th, 2019
Time : 9:30 am
Place : Seminar Room 2277, Innovation Center

Abstract

Over the past few years, deep learning methods, e.g., Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs), have shown promise in facial expression recognition. However, performance degrades dramatically, especially in close-to-real-world settings, due to high intra-class variations and high inter-class similarities introduced by subtle facial appearance changes, head pose variations, illumination changes, occlusions, and identity-related attributes, e.g., age, race, and gender. In this work, we developed two novel CNN frameworks and one novel GAN approach to learn discriminative features for facial expression recognition.

First, a novel island loss is proposed to enhance the discriminative power of learned deep features. Specifically, the island loss is designed to reduce intra-class variations while simultaneously enlarging inter-class differences. Experimental results on two posed facial expression datasets and, more importantly, two spontaneous facial expression datasets show that the proposed island loss outperforms baseline CNNs trained with the traditional softmax loss or the center loss, and achieves better or at least comparable performance compared with state-of-the-art methods.
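
The abstract does not state the formula for the island loss, so the following is only a minimal sketch of one loss with the described behavior. It assumes a center-loss-style term that pulls each feature toward its class center, plus a penalty on the cosine similarity between every pair of distinct class centers. The names `island_loss` and `lam` are illustrative, and in training such a term would typically be added to the softmax loss rather than used alone.

```python
import math

def island_loss(features, labels, centers, lam=1.0):
    """Sketch: intra-class pull term plus a pairwise cosine penalty on class centers.

    features: list of feature vectors (lists of floats)
    labels:   class index for each feature vector
    centers:  one center vector per class (assumed non-zero)
    lam:      hypothetical weight balancing the two terms
    """
    # Intra-class term: half the squared distance from each feature to its class center.
    intra = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        intra += 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    # Inter-class term: (cosine similarity + 1) summed over ordered pairs of distinct
    # centers. Each pair contributes a value in [0, 2]; it is 0 when the two centers
    # point in opposite directions.
    inter = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j == k:
                continue
            dot = sum(a * b for a, b in zip(cj, ck))
            norm = math.sqrt(sum(a * a for a in cj)) * math.sqrt(sum(b * b for b in ck))
            inter += dot / norm + 1.0

    return intra + lam * inter
```

Minimizing the first term tightens each class cluster, while minimizing the second pushes the class centers apart in angle, which is one way to obtain the "islands" of features the name suggests.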

Second, we proposed a novel Probabilistic Attribute Tree-CNN (PAT-CNN) to explicitly deal with the large intra-class variations caused by identity-related attributes. Specifically, a novel PAT module with an associated PAT loss was proposed to learn features in a hierarchical tree structure organized according to identity-related attributes, where the final features are less affected by the attributes. We further proposed a semi-supervised strategy to learn the PAT-CNN from limited attribute-annotated samples to make the best use of available data. Experimental results on three posed facial expression datasets as well as three spontaneous facial expression datasets have demonstrated that the proposed PAT-CNN achieves the best performance compared with state-of-the-art methods by explicitly modeling attributes. Impressively, the PAT-CNN using a single model achieves the best performance on the SFEW test dataset, compared with the state-of-the-art methods using an ensemble of hundreds of CNNs.
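
As a rough illustration of the "probabilistic" part of an attribute tree, the sketch below assumes a one-level tree in which each branch corresponds to one attribute group (for example, an age range) and has its own expression classifier. The final prediction weights each branch's output by the estimated probability that the face belongs to that branch. This structure, and the name `pat_predict`, are assumptions made for illustration; the abstract does not specify how the PAT module combines its branches.

```python
def pat_predict(attr_probs, branch_expr_probs):
    """Sketch: combine per-branch expression distributions using attribute probabilities.

    attr_probs:        probability of each attribute branch for one face (sums to 1)
    branch_expr_probs: for each branch, a probability distribution over expressions
    Returns the attribute-weighted distribution over expressions.
    """
    n_expr = len(branch_expr_probs[0])
    return [
        sum(p * branch[e] for p, branch in zip(attr_probs, branch_expr_probs))
        for e in range(n_expr)
    ]
```

Because the branch weights are soft rather than hard, a face whose attribute is uncertain still receives a prediction that blends the relevant branches instead of being routed to a single, possibly wrong, one.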

Last, we present a novel Identity-Free conditional Generative Adversarial Network (IF-GAN) to explicitly reduce high inter-subject variations caused by identity-related attributes for facial expression recognition. Specifically, for any given input facial expression image, a conditional generative model was developed to transform it into an "average" identity expressive face with the same expression as the input face image. Since the generated images share the same synthetic "average" identity, they differ from each other only in the displayed expressions and can thus be used for identity-free facial expression classification. Experimental results on three well-known facial expression datasets have demonstrated that the proposed IF-GAN outperforms the baseline CNN model and achieves the best or at least comparable performance compared with state-of-the-art methods.