End-to-End Learning to Follow Language Instructions with Compositional Policies

Abstract

We develop an end-to-end model for learning to follow language instructions with compositional policies. Our model combines large language models with pretrained compositional value functions [Nangue Tasse et al., 2020] to generate policies for goal-reaching tasks specified in natural language. We evaluate our method in the BabyAI [Chevalier-Boisvert et al., 2019] environment and demonstrate compositional generalization to novel combinations of task attributes. Notably, our method generalizes to held-out combinations of attributes and, in some cases, accomplishes those tasks with no additional learning samples.
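The compositional value functions referenced above [Nangue Tasse et al., 2020] support Boolean composition: goal-oriented Q-functions learned for base tasks can be combined with elementwise min (conjunction) and max (disjunction) to solve novel task combinations zero-shot. The sketch below illustrates the idea with random arrays; the shapes, task names, and helper functions are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

# Illustrative sketch of Boolean value-function composition
# (Nangue Tasse et al., 2020). All shapes and task names here are
# hypothetical; real Q-functions would come from pretraining.

n_states, n_goals, n_actions = 4, 3, 2
rng = np.random.default_rng(0)

# Pretrained extended Q-functions, indexed [state, goal, action],
# e.g. for the base tasks "reach a blue object" and "reach a box".
Q_blue = rng.random((n_states, n_goals, n_actions))
Q_box = rng.random((n_states, n_goals, n_actions))

def AND(q1, q2):
    """Conjunction: value of satisfying both task specifications."""
    return np.minimum(q1, q2)

def OR(q1, q2):
    """Disjunction: value of satisfying either task specification."""
    return np.maximum(q1, q2)

# Zero-shot policy for the held-out combination "reach a blue box":
# act greedily with respect to the composed Q-function, maximising
# over goals and actions in each state.
Q_blue_box = AND(Q_blue, Q_box)
policy = Q_blue_box.max(axis=1).argmax(axis=1)  # best action per state
```

Because composition is a pointwise min/max over already-learned values, the combined task requires no further environment interaction, which is what enables the zero-shot generalization reported in the abstract.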

Publication
Workshop on Language and Robot Learning @ CoRL
Geraud Nangue Tasse
Associate Lecturer

I am an IBM PhD fellow interested in reinforcement learning (RL) since it is the subfield of machine learning with the most potential for achieving AGI.

Steven James
Deputy Lab Director

My research interests include reinforcement learning and planning.

Benjamin Rosman
Lab Director

I am a Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand in Johannesburg. I work in robotics, artificial intelligence, decision theory and machine learning.