eScholarship
Open Access Publications from the University of California

Comparing Intuitions about Agents’ Goals, Preferences and Actions in Human Infants and Video Transformers

Abstract

Although AI has made large strides in recent years, state-of-the-art models still largely lack core components of social cognition that emerge early in infant development. The Baby Intuitions Benchmark was explicitly designed to compare these "commonsense psychology" abilities in humans and machines. Recurrent neural network-based models previously applied to this dataset have been shown not to capture the desired knowledge. Here we apply a different class of deep learning-based model, namely a video transformer, and show that it matches infant intuitions more closely in quantitative terms. However, qualitative error analyses show that the model is prone to exploiting particularities of the training data in its decisions.
