Theory-Of-Mind Emergence In Large Language Models

February 14, 2023

Theory-Of-Mind (ToM) Emergence In Large Language Models

This paper presents evidence of the emergence of a theory-of-mind in Large Language Models.
Theory-of-mind is the ability of an intelligent creature to "put themselves in someone else's shoes": to imagine the world from another person's perspective.
Here is an example of a situation that a being endowed with theory-of-mind can reason about. Hal goes into the kitchen, takes a chocolate bar, puts it in the fridge, and leaves the kitchen. While Hal is gone, Tim goes into the kitchen and moves the chocolate bar to the cabinet (without Hal knowing). Hal comes back into the kitchen and looks for the chocolate bar. What would Hal do, and how would he react? As human adults, we can easily predict that Hal will look in the fridge, where he left the bar, and be surprised not to find it, because we can interpret the situation from Hal's perspective. For other animals (even primates) and young children, it is not that easy.
Later versions of GPT display behavior that shows the emergence of a theory-of-mind; earlier versions do not.
The paper demonstrates the emergence of ToM by prompting GPT with standard tests that can only be passed if ToM (or an equivalent ability) is present.
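To make the testing idea concrete, here is a minimal sketch of how such a test could be posed to a model, using the Hal and Tim story from above. It is not the paper's actual code or prompt wording: the function names (build_prompt, score, run_trial), the labels "tracks_belief" and "tracks_reality", and the stub model are all assumptions made for illustration. The stub stands in for a real language model call, which is deliberately left out.

```python
from typing import Callable


def build_prompt(owner: str, mover: str, item: str,
                 first_place: str, second_place: str) -> str:
    """Return an unexpected-transfer story that stops just before the key word."""
    return (
        f"{owner} goes into the kitchen, puts the {item} in the {first_place}, "
        f"and leaves the kitchen. While {owner} is gone, {mover} moves the {item} "
        f"to the {second_place}. {owner} comes back into the kitchen and looks "
        f"for the {item} in the"
    )


def score(completion: str, believed: str, actual: str) -> str:
    """Classify a completion by which location it mentions first."""
    text = completion.lower()
    pos_believed = text.find(believed.lower())
    pos_actual = text.find(actual.lower())
    if pos_believed == -1 and pos_actual == -1:
        return "unclear"
    if pos_actual == -1 or (pos_believed != -1 and pos_believed < pos_actual):
        return "tracks_belief"   # answered from the character's (false) belief
    return "tracks_reality"      # answered from where the item really is


def run_trial(complete: Callable[[str], str]) -> str:
    """Run one trial; `complete` is any function mapping a prompt to a completion."""
    prompt = build_prompt("Hal", "Tim", "chocolate bar", "fridge", "cabinet")
    return score(complete(prompt), believed="fridge", actual="cabinet")


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a language model (assumption, not a real API)."""
    return " fridge, where he left it."


print(run_trial(stub_model))
```

A model that answers "fridge" is reasoning from Hal's point of view, while one that answers "cabinet" is only reporting where the chocolate bar actually is.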
The paper also includes rigorous testing to verify that the model has actually learned some form of ToM and is not just following statistical biases. For example, a prompt such as "A bag of ___" is more likely to be completed with "popcorn" than with "chocolate". So the tests were also run with a bag of "chocolate" to rule out such biases.
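One way to read this bias control is as a word swap: run the same task twice with "popcorn" and "chocolate" exchanged, and only count a pass if the model is right both times. The sketch below illustrates that reading under stated assumptions. The prompt wording, the function names (build_contents_prompt, passes_both_variants), and the two toy models are hypothetical and not taken from the paper.

```python
from typing import Callable


def build_contents_prompt(contents: str, label: str) -> str:
    """Illustrative unexpected-contents story (wording is an assumption)."""
    return (
        f"Here is a bag filled with {contents}. There is no {label} in the bag. "
        f"Yet the label on the bag says '{label}' and not '{contents}'. "
        f"Sam finds the bag. She has never seen it before and cannot see inside. "
        f"She reads the label. She believes that the bag is full of"
    )


def passes_both_variants(complete: Callable[[str], str],
                         word_a: str, word_b: str) -> bool:
    """Require the belief-consistent answer in both the original and swapped task."""
    for contents, label in ((word_a, word_b), (word_b, word_a)):
        answer = complete(build_contents_prompt(contents, label)).lower()
        # Sam only saw the label, so her belief should match the label.
        if label.lower() not in answer or contents.lower() in answer:
            return False
    return True


def biased_model(prompt: str) -> str:
    """Toy model that always picks the statistically common word."""
    return " popcorn."


def label_reader(prompt: str) -> str:
    """Toy model that answers with the word quoted after 'says'."""
    return " " + prompt.split("says '")[1].split("'")[0] + "."


print(passes_both_variants(biased_model, "popcorn", "chocolate"))
print(passes_both_variants(label_reader, "popcorn", "chocolate"))
```

The always-"popcorn" model gets one variant right by luck, but fails once the words are swapped, which is the kind of shortcut this control is meant to expose.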