Universal Video Style Transfer via Crystallization, Separation, and Blending

Haofei Lu; Zhizhong Wang

doi:10.24963/ijcai.2022/687

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

AI and Arts. Pages 4957-4965. https://doi.org/10.24963/ijcai.2022/687

Universal video style transfer aims to transfer arbitrary styles to input videos. However, maintaining the temporal consistency of videos while achieving high-quality arbitrary style transfer remains challenging. To address this, we propose CSBNet, which comprises three key modules: 1) the Crystallization (Cr) Module, which generates several orthogonal crystal nuclei from raw VGG features, representing hierarchical stability-aware content and style components; 2) the Separation (Sp) Module, which separates these crystal nuclei to generate stability-enhanced content and style features; and 3) the Blending (Bd) Module, which cross-blends these stability-enhanced content and style features to produce more stable and higher-quality stylized videos. We also introduce a new pair of component enhancement losses to improve network performance. Extensive qualitative and quantitative experiments demonstrate the effectiveness of CSBNet. Compared with state-of-the-art models, it produces more temporally consistent and stable results for arbitrary videos and also achieves higher-quality stylization for arbitrary images.
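
The abstract does not specify how the three modules are implemented, so the following is only a toy sketch of the data flow it describes (orthogonal components, then separation, then cross-blending). It is not the CSBNet architecture: a plain SVD is swapped in as a stand-in for the learned Crystallization Module, the function names `crystallize`, `separate`, and `blend` are hypothetical, and the inputs are assumed to be feature matrices of shape (channels, positions) rather than actual VGG activations.

```python
import numpy as np

def crystallize(features, k):
    """Return k orthonormal directions (rows) of a (channels, positions) matrix.

    Assumption: SVD is used here purely as an illustrative way to obtain
    orthogonal components; the paper's module is learned, not an SVD.
    """
    centered = features - features.mean(axis=1, keepdims=True)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k].T  # shape (k, channels), rows are orthonormal

def separate(features, nuclei):
    """Project centered features onto the nuclei.

    Returns the coefficients (k, positions) and the per-channel mean.
    """
    mean = features.mean(axis=1, keepdims=True)
    return nuclei @ (features - mean), mean

def blend(content_feat, style_feat, k=4):
    """Cross-blend: content coefficients re-expressed in the style basis."""
    content_nuclei = crystallize(content_feat, k)
    style_nuclei = crystallize(style_feat, k)
    content_coeff, _ = separate(content_feat, content_nuclei)
    _, style_mean = separate(style_feat, style_nuclei)
    # Result has the content's spatial layout and the style's channel statistics.
    return style_nuclei.T @ content_coeff + style_mean
```

In this sketch, `blend` returns a matrix with the same shape as the content features, which a decoder (not shown, and not described in the abstract) would then map back to an image or video frame.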

Keywords:

Theory and philosophy of arts and creativity in AI systems: Autonomous creative or artistic AI

Methods and resources: Machine learning, deep learning, neural models, reinforcement learning