State-of-the-art deep learning models have demonstrated success in classifying adult facial expressions by relying on large datasets of labeled images. Unfortunately, labeled images of child expressions are scarce. Deep learning models trained on adult data do not generalize well to child data because of the domain shift caused by morphological differences between adult and child faces. Recent deep domain adaptation approaches align the data distribution of a target domain with that of the source domain using a few target-domain samples. We propose that domain adaptation may be improved by incorporating steps from deep transfer learning, such as initialization with pre-trained source weights and freezing the early layers of the model. The knowledge from a few labeled examples of the child data (target domain) is incorporated into the adult data distribution (source domain) using a contrastive semantic alignment (CSA) loss. This work combines deep transfer learning and domain adaptation to classify facial images of children into seven expression labels (‘happy’, ‘sad’, ‘anger’, ‘fear’, ‘surprise’, ‘disgust’, and ‘neutral’), using adult facial expressions as the source domain and 10 or fewer labeled child samples per expression. Our hybrid approach outperforms the transfer learning model by 12% in mean accuracy using only 10 samples per expression class.
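
As a rough illustration of the contrastive semantic alignment idea mentioned above, the following minimal sketch computes a pairwise loss on embeddings: same-class adult/child pairs are pulled together, and different-class pairs are pushed at least a margin apart. This follows one commonly used formulation of a CSA-style loss and is an assumption here, since the abstract does not state the exact form. The function names, the `margin` parameter, and the plain-list embeddings are hypothetical and chosen only for illustration.

```python
import math


def csa_pair_loss(source_emb, target_emb, same_class, margin=1.0):
    """Contrastive loss for one (source, target) embedding pair (assumed form)."""
    # Euclidean distance between the adult (source) and child (target) embeddings.
    dist = math.sqrt(sum((s - t) ** 2 for s, t in zip(source_emb, target_emb)))
    if same_class:
        # Semantic alignment term: pull same-expression embeddings together.
        return 0.5 * dist ** 2
    # Separation term: penalize different-expression pairs closer than `margin`.
    return 0.5 * max(0.0, margin - dist) ** 2


def csa_loss(source_batch, target_batch, margin=1.0):
    """Mean pair loss over all source/target pairs.

    Each batch is a list of (embedding, label) tuples.
    """
    losses = [
        csa_pair_loss(src_emb, tgt_emb, src_label == tgt_label, margin)
        for src_emb, src_label in source_batch
        for tgt_emb, tgt_label in target_batch
    ]
    return sum(losses) / len(losses)
```

In a full model, a term like this would typically be added to the usual classification loss, with the embeddings produced by the shared network whose early layers are frozen; how the two terms are weighted is not specified in the abstract.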