
How AI Learns to Recognize Human Emotions in Pictures

2025-07-10 · Manuel G. Pascual · 4 min read
Artificial Intelligence
Machine Learning
Psychology

While machines are incapable of feeling emotions or empathizing with people, a new study reveals something startling: they behave as if they understand them. According to a study published in Royal Society Open Science, when modern large language models with multimodal capabilities are asked to rate emotions in images, their responses are remarkably similar to those of human volunteers.

Unlike traditional LLMs trained only on text, these multimodal systems are built using billions of images paired with detailed text descriptions. This creates a complex model that correlates pixels with words, allowing the AI to answer sophisticated questions about visual scenes. The researchers aimed to discover if these systems could judge the emotional content of images, a crucial step in ensuring AI responses align with human values and mitigating the risk of biased or inappropriate reactions.

The study's conclusion is clear: AI ratings show a high correlation with average human ratings. This suggests that modern AI can develop sophisticated representations of emotional concepts through natural language, even without explicit training on emotion recognition.

The Experiment: Pitting AI Against Human Judgment

To test this, researchers used three of the most advanced multimodal systems available today: OpenAI's GPT-4o, Google's Gemini Pro, and Anthropic's Claude Sonnet. The models were prompted to act like a human participant in a psychological experiment.

They were then shown a series of images and asked to rate them on several scales:

  • Valence: How negative or positive the scene was (1-9).
  • Arousal: Whether it provoked relaxation or alertness.
  • Motivational Direction: Whether it made them want to avoid or approach the scene.
  • Basic Emotions: The extent to which the image evoked happiness, anger, fear, sadness, disgust, or surprise.
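A rating along these scales can be pictured as a small record per image. The sketch below is purely illustrative — the field names, and the assumption that every scale runs 1-9 like the valence scale, are mine, not the study's:

```python
# Hypothetical record for one image's ratings. Field names and the
# uniform 1-9 range are illustrative assumptions, not the study's schema.
from dataclasses import dataclass, field

BASIC_EMOTIONS = ("happiness", "anger", "fear", "sadness", "disgust", "surprise")

@dataclass
class ImageRating:
    valence: int   # 1 (very negative) .. 9 (very positive)
    arousal: int   # 1 (relaxed) .. 9 (alert)
    approach: int  # 1 (avoid) .. 9 (approach)
    emotions: dict = field(default_factory=dict)  # emotion name -> intensity 1..9

    def __post_init__(self):
        # Reject any rating outside the assumed 1-9 range.
        for name, value in [("valence", self.valence),
                            ("arousal", self.arousal),
                            ("approach", self.approach),
                            *self.emotions.items()]:
            if not 1 <= value <= 9:
                raise ValueError(f"{name} must be in 1..9, got {value}")
```

Collecting one such record per model per image is all that is needed before the ratings can be compared against the human averages.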

These AI-generated ratings were compared against the responses of 204 human participants who evaluated 362 photos from the Nencki Affective Picture System (NAPS) database, which contains a wide range of positive, neutral, and unpleasant images.
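The study reports agreement as correlations between a model's ratings and the average human ratings across images. A minimal sketch of that comparison, using synthetic numbers rather than the study's actual data:

```python
# Sketch of the comparison method: correlate one model's ratings with the
# mean human rating per image. All numbers here are synthetic placeholders.
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two rating vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm @ ym) / np.sqrt((xm @ xm) * (ym @ ym)))

rng = np.random.default_rng(0)
n_images = 362  # number of NAPS photos rated in the study

# Hypothetical mean human valence ratings on the 1-9 scale
human_mean = rng.uniform(1, 9, n_images)

# A model that tracks human judgments closely, plus some noise
model = np.clip(human_mean + rng.normal(0, 1.0, n_images), 1, 9)

r = pearson_r(human_mean, model)
print(f"correlation with mean human ratings: {r:.2f}")
```

A value near 1.0 means the model's ordering of images from unpleasant to pleasant matches the human consensus, which is the sense in which the scores reported below should be read.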

Surprising Results: AI's Ratings Align with Humans

The findings showed a striking similarity between the judgments of machines and people. According to the study, GPT-4o's responses correlated particularly well with humans, scoring between 0.77 and 0.90 (where 1.0 is a perfect match). Claude also performed strongly, with scores of 0.63-0.90, though it sometimes refused to answer due to its safety protocols. Gemini's scores were somewhat lower, ranging from 0.55 to 0.86, but still tracked human responses closely.

How Does AI Learn Emotion Without Feeling It?

How can multimodal systems achieve this without any capacity for genuine feeling? Alberto Testolin, a co-author of the study, points to the training data. "We tend to think that image-text pairs contain purely visual semantic information, such as ‘image of a field of sunflowers,’" he explains. "Our research suggests that textual descriptions are much richer, allowing us to infer the emotional status of the person who wrote the entry."

In essence, the AI isn't understanding emotion; it's recognizing patterns in the language used to describe emotional scenes. As Professor José Miguel Fernández Dols, who was not involved in the study, notes, if a machine has access to text describing typical human reactions to certain stimuli, it can mimic those judgments. It processes the adjectives, adverbs, and verbs associated with descriptions of a particular type of image.

A Crucial Distinction: Emulation is Not Emotion

The authors stress a critical point: an AI's ability to emulate human ratings does not mean it can think or feel. Human emotional responses are complex and varied, whereas an AI provides an averaged, probabilistic response. The study states, "‘reading about emotions’ is qualitatively different from having direct emotional experiences."

This taps into a wider, more controversial debate in the AI field. While some companies sell facial recognition systems that claim to detect emotions, much of the scientific community disputes the idea of universal emotional expressions, highlighting the significant role culture plays in how we show and interpret feelings. The researchers call for more investigation into these cultural differences.

Professor Fernández Dols concludes that these findings are a topic for reflection: "everyday language is a logical construct that can be perfectly coherent, persuasive, informative, and even emotional without any brain speaking."
