Scanpath Prediction for Visual Attention using IOR-ROI LSTM

Zhenzhong Chen, Wanjie Sun

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 642-648. https://doi.org/10.24963/ijcai.2018/89

Predicting the scanpath elicited by a given stimulus plays an important role in modeling visual attention and search. This paper presents a model that integrates a convolutional neural network and long short-term memory (LSTM) to generate realistic scanpaths. The core of the proposed model is a dual LSTM unit, i.e., an inhibition of return LSTM (IOR-LSTM) and a region of interest LSTM (ROI-LSTM), which captures IOR dynamics and gaze shift behavior simultaneously. The IOR-LSTM simulates visual working memory to adaptively integrate and forget scene information. The ROI-LSTM is responsible for predicting the next ROI given the inhibited image features. Experimental results indicate that the proposed architecture achieves superior performance in predicting scanpaths.
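To make the idea of inhibition of return concrete, the following is a minimal toy sketch and not the paper's learned model. It swaps the IOR-LSTM and ROI-LSTM for a hand-crafted rule: a greedy winner-take-all choice over a fixed saliency grid, with an inhibition map that is strengthened at each fixated cell ("integrate") and decays over time ("forget"). The function name `toy_scanpath` and the parameters `inhibition` and `recovery` are hypothetical and chosen only for illustration.

```python
def toy_scanpath(saliency, n_fixations, inhibition=0.9, recovery=0.5):
    """Greedy scanpath over a 2D saliency grid with decaying inhibition.

    Illustrative assumption only: a fixed rule standing in for the learned
    IOR dynamics described in the abstract.
    """
    rows, cols = len(saliency), len(saliency[0])
    inhibit = [[0.0] * cols for _ in range(rows)]  # 0 = not inhibited
    path = []
    for _ in range(n_fixations):
        # Pick the cell with the highest saliency after attenuation.
        best, best_score = None, float("-inf")
        for r in range(rows):
            for c in range(cols):
                score = saliency[r][c] * (1.0 - inhibit[r][c])
                if score > best_score:
                    best, best_score = (r, c), score
        path.append(best)
        # "Forget": earlier inhibition fades by the recovery factor.
        for r in range(rows):
            for c in range(cols):
                inhibit[r][c] *= recovery
        # "Integrate": suppress the cell that was just fixated.
        inhibit[best[0]][best[1]] = inhibition
    return path


saliency = [[0.1, 0.9],
            [0.6, 0.2]]
print(toy_scanpath(saliency, 3))
# → [(0, 1), (1, 0), (0, 1)]
```

In this toy run the gaze leaves the most salient cell after the first fixation and returns to it once its inhibition has partly decayed. A fixed rule like this cannot adapt how quickly it forgets to the scene content, which is the behavior the abstract attributes to the learned IOR-LSTM.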
Keywords:
Computer Vision: Perception
Computer Vision: Computer Vision