Self-Supervised Mutual Learning for Dynamic Scene Reconstruction of Spiking Camera

Shiyan Chen, Chaoteng Duan, Zhaofei Yu, Ruiqin Xiong, Tiejun Huang

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 2859-2866. https://doi.org/10.24963/ijcai.2022/396

Mimicking the sampling mechanism of the primate fovea, the spiking camera is a retina-inspired vision sensor that has shown great potential for capturing high-speed dynamic scenes at a sampling rate of 40,000 Hz. Unlike conventional digital cameras, the spiking camera continuously captures photons and outputs asynchronous binary spikes with varying inter-spike intervals to record dynamic scenes. However, reconstructing dynamic scenes from asynchronous spike streams remains challenging. In this work, we propose a novel pretext task to build a self-supervised reconstruction framework for spiking cameras. Specifically, we use the blind-spot network commonly used in self-supervised denoising as our backbone and perform self-supervised learning by constructing suitable pseudo-labels. In addition, because the blind-spot network scales poorly and makes insufficient use of the available information, we present a mutual learning framework that improves overall performance through mutual distillation between a non-blind-spot network and a blind-spot network. This also lets the framework bypass the constraints of the blind-spot network, so that state-of-the-art modules can be used to further improve performance. Experimental results demonstrate that our methods clearly outperform previous unsupervised spiking camera reconstruction methods and achieve desirable results compared with supervised methods.
Keywords:
Machine Learning: Self-supervised Learning
Machine Learning: Unsupervised Learning
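
The sampling behaviour described in the abstract (photons captured continuously, binary spikes emitted with varying inter-spike intervals) can be illustrated with a toy single-pixel model. The sketch below is not the method proposed in the paper. It assumes a simple integrate-and-fire pixel with a fixed firing threshold, and it recovers brightness with a naive inter-spike-interval estimate. The function names `simulate_spikes` and `mean_isi_estimate`, the `threshold` parameter, and the normalised intensity values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_spikes(intensity, n_steps, threshold=1.0):
    """Toy integrate-and-fire pixel (an assumed model, not taken from the paper).

    At every time step the pixel adds `intensity` to an accumulator. When the
    accumulator reaches `threshold`, the pixel emits a spike (1) and the
    threshold is subtracted from the accumulator.
    """
    accumulator = 0.0
    spikes = np.zeros(n_steps, dtype=np.uint8)
    for t in range(n_steps):
        accumulator += intensity
        if accumulator >= threshold:
            spikes[t] = 1
            accumulator -= threshold
    return spikes


def mean_isi_estimate(spikes, threshold=1.0):
    """Naive brightness estimate: threshold divided by the mean inter-spike interval.

    Returns 0.0 when fewer than two spikes are present, since no interval
    can be measured in that case.
    """
    spike_times = np.flatnonzero(spikes)
    if spike_times.size < 2:
        return 0.0
    return threshold / float(np.mean(np.diff(spike_times)))


if __name__ == "__main__":
    # A brighter pixel fires more often, so its inter-spike interval is shorter.
    for intensity in (0.25, 0.5):
        stream = simulate_spikes(intensity, n_steps=40)
        print(intensity, int(stream.sum()), mean_isi_estimate(stream))
```

Under this assumed model, a constant input yields a regular spike train whose interval is inversely proportional to brightness. Real spike streams from moving scenes are far less regular, which is the reconstruction difficulty that motivates the learned approach in the abstract.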