2025.findings-emnlp.641@ACL

#1 Semantic Geometry of Sentence Embeddings

Author: Matthieu Tehenan

Sentence embeddings are central to modern natural language processing, powering tasks such as clustering, semantic search, and retrieval-augmented generation. Yet they remain largely opaque: their internal features are not directly interpretable, and users lack fine-grained control over them in downstream tasks. To address this issue, we introduce a formal framework that characterizes the organization of features in sentence embeddings in information-theoretic terms. Building on this foundation, we develop a method to identify interpretable feature directions and show how they can be composed to capture richer semantic structures. Experiments on both synthetic and real-world datasets confirm the presence of this semantic geometry and highlight the utility of our approach for enhancing interpretability and fine-grained control in sentence embeddings.

Subject: EMNLP.2025 - Findings