Emotion-Controllable Generalized Talking Face Generation

Sanjana Sinha, Sandika Biswas, Ravindra Yadav, Brojeshwar Bhowmick

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 1320-1327. https://doi.org/10.24963/ijcai.2022/184

Despite significant progress in recent years, very few AI-based talking face generation methods attempt to render natural emotions. Moreover, their scope is largely limited to the characteristics of the training dataset, so they fail to generalize to arbitrary unseen faces. In this paper, we propose a one-shot, facial geometry-aware emotional talking face generation method that generalizes to arbitrary faces. We propose a graph convolutional neural network that uses a speech content feature, along with an independent emotion input, to generate emotion- and speech-induced motion on a facial geometry-aware landmark representation. This representation is then used by our optical flow-guided texture generation network to produce the final texture. The texture network has two branches, a motion branch and a texture branch, designed to model motion and texture content independently. Compared to previous emotional talking face methods, ours can adapt to arbitrary faces captured in the wild by fine-tuning with only a single image of the target identity in neutral emotion.
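To make the landmark-motion idea concrete, the following is a minimal sketch, not the authors' implementation: a single graph-convolution layer mixes per-landmark features over a facial adjacency graph, conditioned on concatenated speech and emotion vectors, and predicts per-landmark 2D displacements. All dimensions, the chain adjacency, and the function name `gcn_displacements` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 68 facial landmarks, toy feature dimensions.
N_LANDMARKS = 68
SPEECH_DIM, EMOTION_DIM, HIDDEN = 16, 8, 32

# Symmetric, row-normalized adjacency over a simple chain of landmarks
# (a real facial graph would follow the landmark topology, e.g. jaw, lips, eyes).
A = np.eye(N_LANDMARKS)
for i in range(N_LANDMARKS - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A /= A.sum(axis=1, keepdims=True)

def gcn_displacements(landmarks, speech_feat, emotion_feat, W1, W2):
    """One graph-convolution layer predicting per-landmark 2D offsets.

    landmarks: (N, 2) neutral landmark positions.
    speech_feat, emotion_feat: conditioning vectors, broadcast to every node.
    """
    cond = np.concatenate([speech_feat, emotion_feat])   # (S + E,)
    cond = np.tile(cond, (N_LANDMARKS, 1))               # (N, S + E)
    h = np.concatenate([landmarks, cond], axis=1)        # per-node input features
    h = np.maximum(A @ h @ W1, 0.0)                      # neighbor aggregation + ReLU
    return A @ h @ W2                                    # (N, 2) displacement field

# Random toy weights standing in for trained parameters.
W1 = rng.normal(scale=0.1, size=(2 + SPEECH_DIM + EMOTION_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, 2))

neutral = rng.normal(size=(N_LANDMARKS, 2))
speech = rng.normal(size=SPEECH_DIM)
emotion = np.zeros(EMOTION_DIM)
emotion[0] = 1.0  # one-hot emotion label, independent of speech content

offsets = gcn_displacements(neutral, speech, emotion, W1, W2)
animated = neutral + offsets
print(animated.shape)  # (68, 2)
```

Keeping the emotion input as a separate one-hot vector, as sketched here, is what allows the same speech feature to drive different expressions at inference time.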
Keywords:
Computer Vision: Applications
Computer Vision: Machine Learning for Vision
Computer Vision: Neural generative models, auto encoders, GANs  
Humans and AI: Human-Computer Interaction
Machine Learning: Applications