Learning Programs from Rewards with Relational Reinforcement Learning
By: Guillermo Puebla | School of Psychological Science, University of Bristol
Thursday 14 May 2020 at 12:30
Learning to parse the continuous stream of information in the world into relationships between objects is a hallmark of human cognition. Furthermore, people are able to recognize a myriad of relationships between the objects in their environment. For these reasons, several researchers in Cognitive Science have sought to build models that can learn to recognize relationships between objects. However, the problem of how to select a relevant subset of relations to use during learning has been less explored. In this work I integrate relational representations into the Reinforcement Learning framework with the overall goal of learning a relational policy (i.e., a program) that can be transferred to new situations through analogical inference. Importantly, the input to the learning algorithm is a relational description of the state, where only some of the relations are relevant to solving the task. The specific learning algorithm builds on a subfield of Inductive Logic Programming known as Relational Reinforcement Learning. To make it a plausible framework for modeling human learning, however, I adapted the algorithm in several ways. To show the potential of this approach, I will present the results of applying it to two Atari games, “Breakout” and “Pong”.