Abstract

We introduce Watch-And-Help (WAH), a challenge for testing social intelligence in agents. In WAH, an AI agent needs to help a human-like agent perform a complex household task efficiently. To succeed, the AI agent needs to i) understand the underlying goal of the task by watching a single demonstration of the human-like agent performing the same task (social perception), and ii) coordinate with the human-like agent to solve the task in an unseen environment as fast as possible (human-AI collaboration). For this challenge, we build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning-based and learning-based baselines. We evaluate the performance of AI agents with the human-like agent as well as with real humans using objective metrics and subjective user ratings. Experimental results demonstrate that the proposed challenge and virtual environment enable a systematic and scalable evaluation of important aspects of machine social intelligence.

Qualitative Results

We show a set of interesting behaviors that emerge from the helping agents:

Revealing beliefs: Agents can help by revealing information.

False beliefs: The helper's actions can change the environment without the other agent noticing, leaving that agent with false beliefs.

Adversarial helpers: When the helper agent misunderstands the task goal, it may engage in behavior that hinders the main agent.

Code

Watch-And-Help: Code for the models, planner baselines and environments used in the Watch-And-Help Challenge

Simulator: Code for the VirtualHome MultiAgent Simulator

Paper and Supplementary Material

X. Puig, T. Shu, S. Li, Z. Wang, Y. Liao, J. Tenenbaum, S. Fidler, A. Torralba
Watch-And-Help: A Challenge For Social Perception and Human-AI Collaboration.
In Cooperative AI Workshop at NeurIPS 2020 - Best Paper Award
In ICLR 2021 - Spotlight
Camera Ready

[Bibtex]
