EmoGist: Efficient In-Context Learning for Visual Emotion Understanding

2025.findings-emnlp.116@ACL

Authors: Ronald Seoh, Dan Goldwasser

In this paper, we introduce EmoGist, a training-free, in-context learning method for performing visual emotion classification with large vision-language models (LVLMs). The key intuition behind our approach is that context-dependent definitions of emotion labels could allow more accurate emotion predictions, because the ways in which emotions manifest within images are nuanced and highly context dependent. EmoGist pre-generates multiple descriptions of each emotion label by analyzing clusters of example images belonging to that label. At test time, we retrieve one version of the description based on the cosine similarity between the test image and the cluster centroids, and feed it together with the test image to a fast LVLM for classification. Our experiments show that EmoGist improves micro F1 by up to 12 points on the multi-label Memotion dataset and macro F1 by up to 8 points on the multi-class FI dataset.
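
The test-time retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `label_clusters` data layout, the two-dimensional embeddings, and the placeholder descriptions are all assumptions made for illustration. The abstract does not specify the image encoder or how descriptions are stored. The sketch only shows the general idea of choosing, for each emotion label, the pre-generated description whose cluster centroid has the highest cosine similarity to the test image embedding.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_descriptions(test_embedding, label_clusters):
    """For each label, pick the description of the centroid closest to the test image.

    label_clusters is a hypothetical layout: label -> list of (centroid, description).
    """
    selected = {}
    for label, clusters in label_clusters.items():
        _, best_description = max(
            clusters,
            key=lambda pair: cosine_similarity(test_embedding, pair[0]),
        )
        selected[label] = best_description
    return selected


# Toy data: embeddings and descriptions are placeholders, not from the paper.
label_clusters = {
    "joy": [
        ([1.0, 0.0], "joy description A (placeholder)"),
        ([0.0, 1.0], "joy description B (placeholder)"),
    ],
    "anger": [
        ([0.6, 0.8], "anger description A (placeholder)"),
        ([-1.0, 0.0], "anger description B (placeholder)"),
    ],
}

chosen = retrieve_descriptions([0.9, 0.1], label_clusters)
```

In a full pipeline, the selected descriptions would then be placed in the prompt alongside the test image before querying the LVLM. That prompting step is omitted here because the abstract gives no details about it.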

Subject: EMNLP.2025 - Findings