Active Visual Exploration Based on Attention-Map Entropy

Adam Pardyl; Grzegorz Rypeść; Grzegorz Kurzejamski; Bartosz Zieliński; Tomasz Trzciński

doi:10.24963/ijcai.2023/145

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence

Main Track. Pages 1303-1311. https://doi.org/10.24963/ijcai.2023/145

Active visual exploration addresses the limited sensor capabilities of real-world scenarios by actively choosing successive observations based on the environment. To tackle this problem, we introduce a new technique called Attention-Map Entropy (AME). It leverages the internal uncertainty of a transformer-based model to determine the most informative observations. In contrast to existing solutions, it does not require additional loss components, which simplifies training. Through experiments, which also mimic retina-like sensors, we show that this simplified training significantly improves the performance of reconstruction, segmentation and classification on publicly available datasets.
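The idea of choosing the next observation from attention-map entropy can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how entropy is computed or aggregated, so the attention layout (`attention[h][q]` as one head's distribution for candidate location `q`), the averaging over heads, and the names `row_entropy` and `select_next_glimpse` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Set


def row_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one attention row; zero entries contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_next_glimpse(
    attention: List[List[List[float]]], observed: Set[int]
) -> int:
    """Return the unobserved candidate with the highest mean attention entropy.

    attention[h][q] is head h's attention distribution for candidate
    location q (hypothetical layout; each row is assumed to sum to 1).
    Returns -1 if every candidate has already been observed.
    """
    num_heads = len(attention)
    num_candidates = len(attention[0])
    best_idx, best_score = -1, float("-inf")
    for q in range(num_candidates):
        if q in observed:
            continue  # never revisit a location that was already sampled
        # Assumption: average the per-head entropies into a single score.
        score = sum(row_entropy(attention[h][q]) for h in range(num_heads)) / num_heads
        if score > best_score:
            best_idx, best_score = q, score
    return best_idx


# Toy attention maps: 2 heads, 3 candidate locations, 3 keys each.
attention = [
    [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [1 / 3, 1 / 3, 1 / 3]],
    [[0.9, 0.1, 0.0], [0.4, 0.3, 0.3], [0.2, 0.2, 0.6]],
]
next_idx = select_next_glimpse(attention, observed={2})
print(next_idx)
```

In this toy input, location 2 has the most diffuse attention but is marked as observed, so the sketch falls back to location 1. Because the selection signal is read directly from the model's attention maps, no extra loss term is needed to train a separate selection policy, which matches the simplification the abstract describes.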

Keywords:

Computer Vision: CV: Machine learning for vision

Machine Learning: ML: Attention models

Robotics: ROB: Robotics and vision