

Primates can richly parse sensory inputs to infer latent information. This ability is hypothesized to rely on establishing mental models of the external world and running mental simulations of those models. However, evidence supporting this hypothesis is limited to behavioral models that do not emulate neural computations. Here, we test this hypothesis by directly comparing the behavior of primates (humans and monkeys) in a ball interception task to that of a large set of recurrent neural network (RNN) models with or without the capacity to dynamically track the underlying latent variables. Humans and monkeys exhibit similar behavioral patterns. This primate behavioral pattern is best captured by RNNs endowed with dynamic inference, consistent with the hypothesis that the primate brain uses dynamic inferences to support flexible physical predictions. Moreover, our work highlights a general strategy for using model neural systems to test computational hypotheses of higher brain function.

From just a few glances, we can parse the structure of a novel scene, generate a rich understanding of its components, and use this understanding to make general inferences and predictions 1. This understanding helps us infer the latent states of objects and events, predict plausible and implausible future states, plan intervening actions, and anticipate the consequences of those actions. Despite the centrality of these capacities in human intelligence, the underlying computations remain unknown.

A dominant theory is that the brain constructs mental models of the physical world and relies on mental simulations of those models for making inferences 1, 2, 3. Mental simulation is broadly defined as the capacity to imagine “what will or what could be” 4, and is also thought to underlie other cognitive functions such as imagination 5, 6 and counterfactual reasoning 7. Concretely, the mental simulation hypothesis predicts that the nervous system makes inferences in the absence of sensory input by forming a dynamic inference engine that can internally track latent environmental states (see Fig. 1A).

(A) Physical predictions. A dominant theory in cognitive science is that humans make inferences about physical processes using “mental simulations” of the physical world. To illustrate this, the top panel depicts observations of the external world, a ball falling towards the ground. In making an inference about the future state of this world, the brain could form a dynamic inference engine, i.e., dynamically track latent environmental states to generate behavioral outputs. Alternatively, the brain could support such inferences via automatized nonlinear function approximations, without explicitly tracking latent environmental states. We tested the mental simulation hypothesis by directly comparing the behavior of humans, monkeys, and task-optimized recurrent neural network models. We developed a behavioral task (M-Pong, reminiscent of the computer game called Pong) that aims to probe the ability of humans and monkeys to flexibly and rapidly predict the future state of a previously learned rich physical world. We trained RNN models to solve the same task as humans and monkeys (see panel C), but additionally optimized some models to perform dynamic inference of the latent position of the ball.
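The distinction between RNNs with and without dynamic inference can be illustrated with a minimal sketch. It is not the architecture or objective used in this work: the vanilla tanh recurrence, the two linear readouts, the squared-error losses, and every name below (`run_rnn`, `combined_loss`, `latent_weight`, `init_params`) are assumptions chosen for illustration. Under these assumptions, a model trained with `latent_weight = 0` is optimized only for the task output, while a model trained with `latent_weight > 0` is additionally pushed to report the ball position at every time step.

```python
import numpy as np


def init_params(n_in, n_hidden, seed=0):
    """Random weights for a small RNN (hypothetical sizes and scaling)."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(n_hidden)
    return {
        "W_xh": rng.normal(scale=scale, size=(n_hidden, n_in)),
        "W_hh": rng.normal(scale=scale, size=(n_hidden, n_hidden)),
        "b_h": np.zeros(n_hidden),
        "W_task": rng.normal(scale=scale, size=(1, n_hidden)),    # behavioral readout
        "W_latent": rng.normal(scale=scale, size=(2, n_hidden)),  # ball (x, y) readout
    }


def run_rnn(inputs, params):
    """Run a tanh RNN over inputs of shape (T, n_in); return both readouts."""
    h = np.zeros(params["W_hh"].shape[0])
    task_out, latent_out = [], []
    for x in inputs:
        h = np.tanh(params["W_xh"] @ x + params["W_hh"] @ h + params["b_h"])
        task_out.append(params["W_task"] @ h)      # e.g. predicted interception point
        latent_out.append(params["W_latent"] @ h)  # running estimate of ball position
    return np.array(task_out), np.array(latent_out)


def combined_loss(task_out, latent_out, target_intercept, ball_xy, latent_weight):
    """Task loss plus an optional penalty for failing to track the ball.

    latent_weight = 0 -> task-only objective (no dynamic inference pressure).
    latent_weight > 0 -> the model is also optimized to track the latent state.
    """
    task_loss = np.mean((task_out[:, 0] - target_intercept) ** 2)
    latent_loss = np.mean((latent_out - ball_xy) ** 2)
    return task_loss + latent_weight * latent_loss
```

In this sketch, the two model families share the same recurrent network and differ only in the value of `latent_weight` used during optimization, which is one simple way to express "with or without the capacity to dynamically track the underlying latent variables".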
