Text/Speech-Driven Full-Body Animation

Wenlin Zhuang, Jinwei Qi, Peng Zhang, Bang Zhang, Ping Tan

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Demo Track. Pages 5956-5959. https://doi.org/10.24963/ijcai.2022/863

Due to increasing demand from the film and game industries, synthesizing 3D avatar animation has attracted much attention recently. In this work, we present a production-ready text/speech-driven full-body animation synthesis system. Given text and the corresponding speech, our system synthesizes face and body animations simultaneously, which are then skinned and rendered to produce a video stream. We adopt a learning-based approach to synthesize facial animation and a graph-based approach to animate the body, which generates high-quality avatar animation efficiently and robustly. Our results demonstrate that the generated avatar animations are realistic, diverse, and highly correlated with the input text and speech.
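
The abstract describes a two-branch pipeline: a learned model maps speech to facial animation, a motion graph driven by the text supplies body animation, and the two streams are combined per frame before skinning and rendering. The sketch below illustrates only that structure; all class names, function names, and feature dimensions are hypothetical placeholders, not the authors' implementation.

```python
# Hypothetical sketch of a two-branch text/speech-to-animation pipeline.
# Names and dimensions are illustrative assumptions, not the paper's API.
from dataclasses import dataclass
from typing import List


@dataclass
class AnimationFrame:
    face_blendshapes: List[float]  # facial expression coefficients
    body_pose: List[float]         # joint rotations for the body skeleton


class FaceModel:
    """Stand-in for a learned speech-to-facial-animation model."""

    def predict(self, speech_frame: List[float]) -> List[float]:
        # A real system would run a trained neural network here.
        return [0.0] * 52


class MotionGraph:
    """Stand-in for a graph of body-motion clips keyed by the input text."""

    def search(self, text: str) -> List[List[float]]:
        # A real system would walk the graph for clips matching the text;
        # here we return a dummy pose sequence (10 frames, 72 DoF).
        return [[0.0] * 72 for _ in range(10)]


def synthesize(text: str, speech_features: List[List[float]]) -> List[AnimationFrame]:
    """Merge per-frame face predictions with graph-retrieved body motion."""
    face_model, motion_graph = FaceModel(), MotionGraph()
    body_frames = motion_graph.search(text)
    frames = []
    for i, speech_frame in enumerate(speech_features):
        face = face_model.predict(speech_frame)
        body = body_frames[min(i, len(body_frames) - 1)]
        frames.append(AnimationFrame(face_blendshapes=face, body_pose=body))
    return frames  # the frames would then be skinned and rendered to video
```

The split between a per-frame learned face branch and a clip-level graph-based body branch mirrors the division stated in the abstract; how the two are synchronized and blended in the actual system is not specified there.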
Keywords:
Humans and AI: Cognitive Systems
Agent-based and Multi-agent Systems: Human-Agent Interaction
Humans and AI: Applications
Humans and AI: Cognitive Modeling
Humans and AI: Human-Computer Interaction