Reinforcement Learning

Hierarchical Subtask Discovery with Non-negative Matrix Factorization

Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, …

Adam Earle, Andrew Saxe, Benjamin Rosman

Belief Reward Shaping in Reinforcement Learning

A key challenge in many reinforcement learning problems is delayed rewards, which can significantly slow down learning. Although reward …

Ofir Marom, Benjamin Rosman

Online Constrained Model-based Reinforcement Learning

Applying reinforcement learning to robotic systems poses a number of challenging problems. A key requirement is the ability to handle …

Benjamin van Niekerk, Andreas Damianou, Benjamin Rosman

Hierarchical Subtask Discovery with Non-negative Matrix Factorization

Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, …

Adam Earle, Andrew Saxe, Benjamin Rosman

Hierarchy Through Composition with Multitask LMDPs

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Most current hierarchical frameworks …

Andrew Saxe, Adam Earle, Benjamin Rosman

A Bayesian approach for Learning and Tracking Switching, Non-stationary Opponents

In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an …

Pablo Hernandez-Leal, Benjamin Rosman, Matthew Taylor, L Enrique Sucar, Enrique Munoz de Cote

Bayesian Policy Reuse

A long-lived autonomous agent should be able to respond online to novel instances of tasks from a familiar domain. Acting online …

Benjamin Rosman, Majd Hawasly, Subramanian Ramamoorthy

Identifying and Tracking Switching, Non-stationary Opponents: a Bayesian Approach

In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an …

Pablo Hernandez-Leal, Matthew Taylor, Benjamin Rosman, L Enrique Sucar, Enrique Munoz de Cote

Reinforcement Learning with Parameterized Actions

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions—discrete actions with …

Warwick Masson, Pravesh Ranchod, George Konidaris

Nonparametric Bayesian Reward Segmentation for Skill Discovery Using Inverse Reinforcement Learning

We present a method for segmenting a set of unstructured demonstration trajectories to discover reusable skills using inverse …

Pravesh Ranchod, Benjamin Rosman, George Konidaris