Recurrent Existence Determination Through Policy Optimization

Baoxiang Wang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 3656-3662. https://doi.org/10.24963/ijcai.2019/507

Binary determination of the presence of objects is one of the problems where humans perform far better than computer vision systems, in terms of both speed and precision. One possible reason is that humans can skip most of the clutter and attend only to salient regions. Recurrent attention models (RAM), trained with the REINFORCE algorithm, are the first computational models to imitate the way humans process images. Although RAM was originally designed for image recognition, we extend it and present recurrent existence determination, an attention-based mechanism for solving the existence determination problem. Our algorithm employs a novel $k$-maximum aggregation layer and a new reward mechanism to address the issue of delayed rewards, which would otherwise destabilize the training process. Experimental analysis demonstrates significant improvements in efficiency and accuracy over existing approaches on both synthetic and real-world datasets.
Keywords:
Machine Learning: Reinforcement Learning
Machine Learning Applications: Applications of Reinforcement Learning
Planning and Scheduling: POMDPs
Machine Learning Applications: Bio/Medicine