A Multi-Objective Approach to Mitigate Negative Side Effects

Sandhya Saisubramanian; Ece Kamar; Shlomo Zilberstein

doi:10.24963/ijcai.2020/50

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence

Main track. Pages 354-361. https://doi.org/10.24963/ijcai.2020/50

Agents operating in unstructured environments often create negative side effects (NSE) that may not be easy to identify at design time. We examine how various forms of human feedback or autonomous exploration can be used to learn a penalty function associated with NSE during system deployment. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. The slack denotes the maximum deviation from an optimal policy with respect to the agent's primary objective allowed in order to mitigate NSE as a secondary objective. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE and that different feedback mechanisms introduce different biases, which influence the identification of NSE.
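The role of the slack can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a small, hypothetical set of candidate policies whose expected primary cost and expected NSE penalty have already been evaluated, and it simply picks the policy with the lowest NSE penalty among those whose primary cost stays within the slack of the optimum. The function name, the policy names, and all numbers are invented for illustration.

```python
from typing import Dict, Tuple


def pick_policy_with_slack(values: Dict[str, Tuple[float, float]], slack: float) -> str:
    """Lexicographic selection with slack (illustrative assumption, not the paper's solver).

    `values` maps a policy name to (primary_cost, nse_penalty); lower is better for both.
    """
    # Optimal value of the primary objective over all candidates.
    best_primary = min(primary for primary, _ in values.values())
    # Keep only policies whose primary cost deviates from the optimum by at most `slack`.
    admissible = {
        name: v for name, v in values.items() if v[0] <= best_primary + slack
    }
    # Among admissible policies, minimize the NSE penalty; break ties by primary cost.
    return min(admissible, key=lambda name: (admissible[name][1], admissible[name][0]))


# Hypothetical candidates: (expected primary cost, expected NSE penalty).
candidates = {
    "shortest_path": (10.0, 5.0),
    "small_detour": (12.0, 1.0),
    "long_detour": (20.0, 0.0),
}

print(pick_policy_with_slack(candidates, slack=0.0))
print(pick_policy_with_slack(candidates, slack=3.0))
```

With zero slack the agent keeps the policy that is optimal for its primary objective regardless of side effects; a larger slack admits costlier policies that incur a smaller NSE penalty.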

Keywords:

Agent-based and Multi-agent Systems: Human-Agent Interaction

Humans and AI: Human-AI Collaboration

Planning and Scheduling: Markov Decision Processes