AI Lacks Sensory Understanding of Flowers, Study Reveals
A new study from The Ohio State University reveals that prominent large language models (LLMs) such as ChatGPT struggle to understand and describe flowers with the same depth as humans. These models are trained primarily on language and image data, and therefore lack direct experience of sensory and motor activities such as touching or smelling.
Unpacking the Study's Discoveries on AI Perception
“A large language model can’t smell a rose, touch the petals of a daisy or walk through a field of wildflowers,” said Qihui Xu, lead author of the study and a postdoctoral researcher in psychology at The Ohio State University. “Without those sensory and motor experiences, it can’t truly represent what a flower is in all its richness. The same is true of some other human concepts,” Xu added.
Xu and colleagues conducted a comparative analysis involving humans and various LLMs, specifically two models each from OpenAI (GPT-3.5 and GPT-4) and Google (PaLM and Gemini). Their research utilized a knowledge base of 4,442 words, including terms like ‘flower’, ‘hoof’, ‘humorous’, and ‘swing’.
How Researchers Tested AI's Grasp of Concepts
Two primary measures were employed to evaluate both human and LLM understanding. The first, known as Glasgow Norms, required ratings of words across nine dimensions, such as arousal, concreteness, and imageability. For example, this measure assesses how emotionally arousing a flower is or how readily one can form a mental image of it.
The second measure, the Lancaster Norms, rated how strongly word concepts are experienced through the senses (touch, hearing, smell, vision) and through motor actions involving the mouth, hands, arms, and torso. This might involve rating how much one experiences flowers through smell, or through actions involving the torso.
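To make the two instruments concrete, here is a minimal, purely illustrative sketch in Python of what such rating records might look like. The dimension subsets and the numbers are invented for demonstration and are not taken from the study.

```python
# Illustrative sketch only: hypothetical rating records in the spirit of the
# Glasgow Norms (word-level dimensions such as arousal and imageability) and
# the Lancaster Norms (sensorimotor experience strength). The dimension
# subsets and numbers are invented, not the study's data.

glasgow_ratings = {
    # word -> mean rating per dimension (a subset of the nine dimensions)
    "flower":   {"arousal": 5.2, "concreteness": 6.8, "imageability": 6.9},
    "humorous": {"arousal": 6.1, "concreteness": 2.4, "imageability": 3.0},
}

lancaster_ratings = {
    # word -> how strongly the concept is experienced through each channel
    "flower":   {"smell": 4.6, "vision": 4.8, "touch": 3.9, "torso": 0.7},
    "humorous": {"smell": 0.2, "vision": 1.5, "touch": 0.4, "torso": 0.9},
}

# A human rater and an LLM would each fill in tables like these for all
# 4,442 words, allowing the two sets of ratings to be compared directly.
print(lancaster_ratings["flower"]["smell"])  # 4.6
```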
The researchers then examined how closely LLM and human judgments align. The first analysis tested whether humans and the models generally agree on individual properties, such as how emotionally arousing different concepts are. The second analysis tested whether humans and LLMs judge the relationships between words in the same way. For instance, although both pasta and roses have strong smells, humans judge pasta to be far more similar to noodles than to roses, because human comparison is not based on a single sense such as smell but also draws on visual and taste information.
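The two analyses can be pictured with a short hypothetical sketch: the first checks whether human and model ratings line up on a single dimension, while the second asks whether similarity computed over many dimensions ranks word pairs the way humans do. All numbers below are invented; this is not the study's code or data.

```python
# Minimal sketch of the two comparisons described above, with invented values.
import numpy as np
from scipy.stats import spearmanr

words = ["flower", "pasta", "noodles", "rose"]

# Hypothetical mean "smell" ratings for each word (human vs. one LLM).
human_smell = np.array([4.6, 4.2, 4.0, 4.7])
llm_smell   = np.array([4.1, 3.8, 3.9, 4.5])

# Analysis 1: do humans and the model agree on a single dimension?
rho, _ = spearmanr(human_smell, llm_smell)
print(f"dimension-level agreement (smell): rho = {rho:.2f}")

# Hypothetical rating vectors over several dimensions
# (e.g. smell, vision, taste, touch) for each word.
human_vecs = {
    "pasta":   np.array([4.2, 4.5, 4.9, 3.0]),
    "noodles": np.array([4.0, 4.4, 4.8, 3.1]),
    "rose":    np.array([4.7, 4.8, 0.5, 3.6]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Analysis 2: similarity over many dimensions ranks pasta closer to noodles
# than to roses, even though pasta and roses both score high on smell.
print("pasta ~ noodles:", round(cosine(human_vecs["pasta"], human_vecs["noodles"]), 3))
print("pasta ~ rose:   ", round(cosine(human_vecs["pasta"], human_vecs["rose"]), 3))
```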
Key Differences in Human vs AI Understanding
Overall, the LLMs' ratings correlated strongly with human ratings. However, notable gaps emerged on dimensions tied to sensory and motor experience, such as taste, where the models failed to fully capture human-like conceptual understanding.
“From the intense aroma of a flower, the vivid silky touch when we caress petals, to the profound joy evoked, human representation of ‘flower’ binds these diverse experiences and interactions into a coherent category,” the researchers noted in their paper.
Xu further commented, “They obtain what they know by consuming vast amounts of text – orders of magnitude larger than what a human is exposed to in their entire lifetimes – and still can’t quite capture some concepts the way humans do.”
The Evolving Future of AI and Sensory Experience
Despite these findings, Xu highlighted that LLMs are continuously improving and are expected to become better at capturing human-like emotions and concepts over time.
Future developments, particularly the integration of LLMs with sensor data and robotics, are anticipated to enhance their reasoning abilities and enable them to act more appropriately within the physical world.
The complete findings of the study were published last week in the journal Nature Human Behaviour.