Interactive Grounded Language Understanding in a Collaborative Environment
Human intelligence has the remarkable ability to quickly adapt to new tasks and environments. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions. To facilitate research in this direction, we propose the NeurIPS IGLU competition: Interactive Grounded Language Understanding in a Collaborative Environment.
The primary goal of the competition is to approach the problem of how to build interactive agents that learn to solve a task when provided with grounded natural language instructions in a collaborative environment. Given the complexity of the challenge, we split it into sub-tasks to make it feasible for participants.
This research challenge is naturally related, but not limited, to two fields of study: Natural Language Understanding and Generation (NLU/G) and Reinforcement Learning (RL). The challenge can therefore bring the two communities together to approach one of the important challenges in AI. Another important aspect of the challenge is the commitment to a human-in-the-loop evaluation as the final evaluation of the agents developed by contestants.
The goal of our competition is to approach the following scientific challenge: how can we build interactive agents that learn to solve a task when provided with grounded natural language instructions in a collaborative environment? By an interactive agent we mean one that follows instructions correctly, asks for clarification when needed, and quickly adapts newly acquired skills, just as humans do when collaborating with each other. An example of such grounded collaboration is presented in Figure 1.
Tasks and Application Scenarios
Given the current state of the field, our main research challenge may be too complex for a reasonable end-to-end solution. Therefore, we split the problem into the following concrete research questions. Each corresponds to a separate task that can be used to study one component individually before joining all of them into one system:
RQ1: How to teach?
In other words, what is the best strategy for an Architect when instructing a Builder agent, such that the concept is reasonably explained?
RQ2: How to learn?
That is, what methods should be used to train a Builder that can follow given instructions from an Architect? This question can be further split into two sub-questions:
RQ2.1: How is a ‘silent’ Builder able to learn?
A silent Builder follows instructions without the ability to ask for any clarification from the Architect.
RQ2.2: How is an ‘interactive’ Builder able to learn?
An interactive Builder can ask the Architect clarifying questions to gain more information about the task when it is uncertain.
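The difference between the two Builder settings can be illustrated with a minimal sketch. Everything in it is hypothetical and is not part of the IGLU environment or its API: the class names, the `respond` and `exchange` functions, and the keyword-based uncertainty check are assumptions made purely for illustration. A real Builder would act in the environment and would need a learned notion of uncertainty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    speaker: str  # "architect" or "builder"
    text: str


class SilentBuilder:
    """Hypothetical silent Builder (RQ2.1): never asks for clarification."""

    def respond(self, instruction: str) -> Optional[str]:
        return None  # no question; the Builder would simply act


class InteractiveBuilder:
    """Hypothetical interactive Builder (RQ2.2): may ask a question."""

    def __init__(self, ambiguous_words=("it", "there", "that")):
        # Toy stand-in for an uncertainty estimate (an assumption).
        self.ambiguous_words = set(ambiguous_words)

    def respond(self, instruction: str) -> Optional[str]:
        words = {w.strip(".,!?").lower() for w in instruction.split()}
        if words & self.ambiguous_words:
            return "Could you clarify what you are referring to?"
        return None


def exchange(builder, instruction: str, clarification: str) -> List[Turn]:
    """One Architect instruction, plus an optional clarification round."""
    turns = [Turn("architect", instruction)]
    question = builder.respond(instruction)
    if question is not None:
        turns.append(Turn("builder", question))
        turns.append(Turn("architect", clarification))
    return turns


if __name__ == "__main__":
    for builder in (SilentBuilder(), InteractiveBuilder()):
        dialogue = exchange(builder, "Put it there.", "The red block, on the left.")
        print(type(builder).__name__, [t.speaker for t in dialogue])
```

In this sketch the silent Builder always produces a single-turn exchange, while the interactive Builder adds a question and an answer only when its toy check flags the instruction as ambiguous.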
Figure 2 details the general structure of the IGLU competition, which can be split into two main stages:
Stage 1: Training Period (Maximum 5 submissions per day);
Stage 2: Final Human-in-the-loop evaluation, which is fully performed by the organizers.
Figure 2: The general flow of the IGLU competition, which consists of two main stages: (1) Training period; (2) Final human-in-the-loop evaluation.
July 2 – Stage 1 begins;
October 1 – Stage 1 ends;
October 22 – Stage 2 begins by deploying the top-3 performing agents for human evaluation;
November 26 – The results of Stage 2 are posted, and the list of winning teams per task is released;
December 6 – NeurIPS 2021 begins.