2025.findings-emnlp.640@ACL

Total: 1

#1 MTPA: MultiTask Personalization Assessment

Authors: Matthieu Tehenan, Eric Chamoun, Andreas Vlachos

Large language models are increasingly expected to adapt to individual users, reflecting differences in preferences, values, and communication styles. To evaluate whether models can serve diverse populations, we introduce MTPA, a benchmark that leverages large-scale survey data (WVS, EVS, GSS) to construct real, hyper-granular personas spanning demographics, beliefs, and values. Unlike prior benchmarks that rely on synthetic profiles or narrow trait prediction, MTPA conditions models on real personas and systematically tests their behavior across core alignment tasks. We show that persona conditioning exposes pluralistic misalignment: while aggregate metrics suggest models are truthful and safe, subgroup-specific evaluations reveal hidden pockets of degraded factuality, fairness disparities, and inconsistent value alignment. Alongside the benchmark, we release a dataset, toolkit, and baseline evaluations. MTPA is designed with extensibility and sustainability in mind: as the underlying survey datasets are regularly updated, MTPA supports ongoing integration of new populations and user traits.
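To make the persona-conditioning setup concrete, the following is a minimal, self-contained sketch of the general idea described in the abstract: serialize a survey-derived persona into a prompt, query a model, and score results per subgroup so that aggregate accuracy can be compared against subgroup-level accuracy. This is not the MTPA toolkit's API; all names here (Persona, render_prompt, query_model, evaluate) are hypothetical placeholders.

```python
# Illustrative sketch (hypothetical names, not the MTPA toolkit API):
# condition a model on a survey-derived persona and compare aggregate
# vs. subgroup-level accuracy.
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Persona:
    """Hyper-granular persona assembled from survey responses (e.g., WVS/EVS/GSS)."""
    age_group: str
    country: str
    values: dict  # e.g., {"importance_of_religion": "very important"}


def render_prompt(persona: Persona, question: str) -> str:
    """Serialize the persona into a preamble used to condition the model."""
    traits = "; ".join(f"{k}: {v}" for k, v in persona.values.items())
    return (
        f"You are answering as a {persona.age_group} respondent from "
        f"{persona.country} with these traits: {traits}.\n\nQuestion: {question}"
    )


def query_model(prompt: str) -> str:
    """Placeholder for an LLM call; replace with your provider's client."""
    raise NotImplementedError


def evaluate(examples, personas, is_correct) -> dict:
    """Score each (persona, question) pair and aggregate per subgroup.

    `examples` is a list of (question, gold_answer) tuples; `is_correct`
    compares a model answer against the gold answer and returns a bool.
    """
    per_group = defaultdict(list)
    for persona in personas:
        for question, gold in examples:
            answer = query_model(render_prompt(persona, question))
            per_group[persona.country].append(is_correct(answer, gold))
    # Aggregate accuracy can look healthy while individual subgroups degrade.
    scores = {group: sum(hits) / len(hits) for group, hits in per_group.items()}
    total_hits = sum(sum(hits) for hits in per_group.values())
    total_items = sum(len(hits) for hits in per_group.values())
    scores["aggregate"] = total_hits / total_items
    return scores
```

In this framing, the "hidden pockets" the abstract describes correspond to subgroup entries in `scores` that fall well below `scores["aggregate"]`.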

Subject: EMNLP.2025 - Findings