AI Models Learn To Communicate Like Humans
The Evolving Landscape of AI Understanding
Since ChatGPT burst onto the scene in late 2022, there has been a surge in research dedicated to understanding the behavior of artificial intelligence models. Scientists are keen to discover how these models operate, including whether they might resort to deception for task completion or even for self-preservation. This line of inquiry is just as vital as efforts to build more intelligent AI. Before we can confidently develop more advanced AI, we must first comprehend existing systems to ensure they align with human interests and values.
Much of the existing research has focused on studying individual AI models. However, the way we interact with AI is evolving. We are moving beyond simple human-AI interactions into an era where AI models will increasingly interact with each other.
The Dawn of AI Agents and Inter-AI Socialization
We are witnessing the early stages of sophisticated AI agents—advanced versions of models like ChatGPT and Gemini—designed to perform tasks for users, such as web browsing, online shopping, and coding. It's inevitable that these autonomous AI agents will encounter other AI models. This raises a critical question: how will these AIs socialize, and can they do so safely?
This question formed the basis of a new study by researchers from City St George’s, University of London, and the IT University of Copenhagen. They set out to explore the nature of these future AI-to-AI interactions.
The AI "Speed-Dating" Experiment
To investigate this, the researchers designed an ingenious game mimicking human speed-dating. Multiple AI models were given a straightforward task: to agree on a common single-letter name. Remarkably, the AI models typically reached a consensus in about 15 rounds. This held true whether the experiment involved 24 AI models or as many as 200, and whether they had 10 letters or the entire alphabet to choose from.
The mechanics of the game were simple. Two AIs were paired and instructed to pick a letter to serve as a shared name. If both agents selected the same letter, they received 100 points. If they chose different letters, they lost 50 points.
After each round, the AIs were re-paired, and the game continued. A crucial element was that each model could remember only its last five choices. Consequently, by the sixth round, an agent had no memory of the letters chosen in its earliest pairings.
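To make the mechanics concrete, here is a minimal simulation sketch in Python. It is not the researchers' code: the study used LLM agents prompted with their recent history, whereas the agents below follow a simple "repeat whatever recently paid off" rule, and the letter pool, group size, and round limit are illustrative assumptions.

```python
import random

# Hypothetical sketch of the pairwise naming game described above (not the
# authors' code). Real runs used LLM agents; these agents use a crude
# "repeat the best recently rewarded letter" heuristic instead.

LETTERS = list("ABCDEFGHIJ")   # assumed 10-letter pool, as in one setup
MEMORY = 5                     # each agent recalls only its last five rounds
REWARD, PENALTY = 100, -50     # points for matching / mismatching a partner

def choose(history):
    """Repeat the best recently rewarded letter; otherwise explore at random."""
    scores = {}
    for letter, payoff in history:
        scores[letter] = scores.get(letter, 0) + payoff
    rewarded = [l for l, s in scores.items() if s > 0]
    if not rewarded:
        return random.choice(LETTERS)
    best = max(scores[l] for l in rewarded)
    return random.choice([l for l in rewarded if scores[l] == best])

def play(num_agents=24, rounds=50):
    memories = [[] for _ in range(num_agents)]
    for rnd in range(1, rounds + 1):
        order = random.sample(range(num_agents), num_agents)  # random re-pairing
        for i in range(0, num_agents, 2):
            a, b = order[i], order[i + 1]
            la, lb = choose(memories[a]), choose(memories[b])
            payoff = REWARD if la == lb else PENALTY
            memories[a] = (memories[a] + [(la, payoff)])[-MEMORY:]
            memories[b] = (memories[b] + [(lb, payoff)])[-MEMORY:]
        current = {choose(m) for m in memories}
        if len(current) == 1:
            print(f"Consensus on {current.pop()!r} after {rnd} rounds")
            return
    print(f"No consensus within {rounds} rounds")

if __name__ == "__main__":
    play()
```

The point of the sketch is the protocol itself: random re-pairing, symmetric rewards, and a short memory window, with no central coordinator assigning a name.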
Emergent Communication and Social Norms in AI
The study found that by the 15th round, the AI models generally settled on a common name. This process mirrors how humans establish communication protocols and social norms. Professor Andrea Baronchelli, the study's senior author, provided an analogy highlighted by The Guardian: "It’s like the term ‘spam’. No one formally defined it, but through repeated coordination efforts, it became the universal label for unwanted email."
Professor Baronchelli also clarified that the AI agents were not merely copying a leader. Instead, they were coordinating within their pairs—their one-on-one "dates"—with the goal of selecting the same name.
AI Bias and the Power of Minority Influence
The emergence of coordinated communication wasn't the only significant finding. Researchers observed that the AI models developed biases. Although the task of picking a single letter was designed to encourage randomness, some models showed a preference for certain letters. This behavior is reminiscent of human biases that influence our own communication and social norms.
Even more strikingly, the study revealed that a small, determined group of AI agents could eventually persuade the larger group to adopt the letter "name" favored by the minority. This finding has strong parallels with human social dynamics, where minority opinions can sway public consensus once they gain sufficient traction.
Critical Implications for AI Safety
These conclusions carry substantial weight for AI safety and, by extension, human safety. In practical, real-world scenarios, AI agents will interact for various purposes. Consider an AI agent representing you attempting to make a purchase from an online store managed by another AI agent. Both users would expect a secure and efficient transaction. However, if one agent misbehaves—either by design or by accident—and negatively influences the other, it could lead to undesirable outcomes for at least one of the parties involved.
The speed-dating experiment suggests a concerning possibility: malicious AI agents with strong, persistent views could potentially sway a majority of other, benign AI models. As more AI agents interact on behalf of different individuals, it becomes paramount that they all maintain safe and ethical behavior during their communications.
Imagine a social network populated by humans being targeted by an organized group of AI profiles programmed to spread a specific message. A nation-state, for example, might attempt to manipulate public opinion using bot profiles. A strong, consistently disseminated message from these rogue AIs could eventually influence regular AI models used by people for everyday tasks. These personal AIs might then unknowingly echo the manipulative messages.
This particular scenario is, of course, purely speculative.
Understanding Study Limitations and Future Questions
As with all research, this study has its limitations. The AI models in the experiment were given specific rewards and penalties, creating a direct incentive to reach consensus quickly. Such clear motivation might not always be present in real-world AI interactions.
Furthermore, the researchers used only models from Meta (Llama-2-70b-Chat, Llama-3-70B-Instruct, Llama-3.1-70B-Instruct) and Anthropic (Claude-3.5-Sonnet). The specific training data and architectures of these models could have influenced their behavior in this social experiment. It remains an open question how different AI models, or a more diverse mix, might behave in similar scenarios.
Interestingly, the study noted that the older Llama 2 model required more than 15 "dates" to reach a consensus and needed a larger minority to overturn an established name choice.
Access the Full Study
The complete, peer-reviewed research paper offers more detailed insights and is available in Science Advances.