IDPT: Interconnected Dual Pyramid Transformer for Face Super-Resolution

Jingang Shi; Yusi Wang; Songlin Dong; Xiaopeng Hong; Zitong Yu; Fei Wang; Changxin Wang; Yihong Gong

doi:10.24963/ijcai.2022/182

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

Main Track. Pages 1306-1312. https://doi.org/10.24963/ijcai.2022/182

Face super-resolution (FSR) aims to generate high-resolution (HR) face images from the corresponding low-resolution (LR) inputs, and has received considerable attention because of its wide application prospects. However, due to the diversity of facial texture and the difficulty of reconstructing detailed content from degraded images, FSR is still far from solved. In this paper, we propose a novel and effective Transformer-based face super-resolution framework, namely the Interconnected Dual Pyramid Transformer (IDPT). Instead of simply stacking cascaded feature reconstruction blocks, the proposed IDPT uses a pyramid encoder/decoder Transformer architecture to extract coarse and detailed facial textures respectively, while the relationship between the dual pyramid Transformers is further explored by a bottom pyramid feature extractor. The pyramid encoder/decoder structure is devised to adapt hierarchically to the varying characteristics of textures at different spatial scales. A novel fusing modulation module is inserted in each spatial layer to guide the refinement of detailed texture by the corresponding coarse texture, while simultaneously fusing the shallow-layer coarse feature with the corresponding deep-layer detailed feature. Extensive experiments and visualizations on various datasets demonstrate the superiority of the proposed method for face super-resolution tasks.
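The abstract does not specify how the fusing modulation module is computed, so the following is only an illustrative sketch of one common way a coarse feature can guide a detailed feature at the same spatial layer: a scale-and-shift modulation followed by an additive fusion. The function name `fuse_modulate`, the projection matrices `w_scale` and `w_shift`, and the tensor shapes are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fuse_modulate(coarse, detail, w_scale, w_shift):
    """Hypothetical fusion step (an assumption, not the paper's module).

    coarse, detail: feature maps of shape (H, W, C) at the same spatial layer.
    w_scale, w_shift: (C, C) matrices standing in for learned projections.
    """
    # Per-position, per-channel gain derived from the coarse feature, in (-1, 1).
    scale = np.tanh(coarse @ w_scale)
    # Per-position, per-channel offset derived from the coarse feature.
    shift = coarse @ w_shift
    # The coarse texture guides refinement of the detailed texture.
    modulated = detail * (1.0 + scale) + shift
    # Fuse the coarse feature itself with the modulated detail feature.
    return modulated + coarse

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
coarse = rng.standard_normal((H, W, C))
detail = rng.standard_normal((H, W, C))

# With zero projections the modulation is the identity, so the output
# reduces to a plain sum of the two feature maps.
fused = fuse_modulate(coarse, detail, np.zeros((C, C)), np.zeros((C, C)))
```

In this sketch the coarse branch both modulates the detail branch and is added back to it, which mirrors the abstract's description of guiding and fusing at once; the actual module in IDPT may differ substantially.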

Keywords:

Computer Vision: Biometrics, Face, Gesture and Pose Recognition

Computer Vision: Computational photography