2504.21605


RDF-Based Structured Quality Assessment Representation of Multilingual LLM Evaluations

Authors: Jonas Gwozdz, Andreas Both

Large Language Models (LLMs) increasingly serve as knowledge interfaces, yet systematically assessing their reliability in the presence of conflicting information remains difficult. We propose an RDF-based framework to assess multilingual LLM quality, focusing on knowledge conflicts. Our approach captures model responses across four distinct context conditions (complete, incomplete, conflicting, and no-context information) in German and English. This structured representation enables comprehensive analysis of knowledge leakage (where models favor training data over provided context), error detection, and multilingual consistency. We demonstrate the framework through a fire safety domain experiment, which reveals critical patterns in context prioritization and language-specific performance and shows that our vocabulary was sufficient to express every assessment facet encountered in the 28-question study.
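To make the idea of a structured representation concrete, the following is a minimal sketch of how one question's responses across the four context conditions and two languages might be serialized as N-Triples. It is not the authors' vocabulary: the namespace `http://example.org/llm-eval#`, the predicates `forQuestion`, `contextCondition`, and `answerText`, and the helper names are all assumptions made for illustration, and the answer strings are placeholders rather than experimental data.

```python
from itertools import product

# Hypothetical namespace; the paper's actual vocabulary is not given in the abstract.
EX = "http://example.org/llm-eval#"

# The four context conditions and two languages named in the abstract.
CONDITIONS = ["complete", "incomplete", "conflicting", "no_context"]
LANGUAGES = ["de", "en"]


def escape_literal(text):
    """Escape backslashes, quotes, and newlines for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def response_triples(question_id, condition, lang, answer_text):
    """Return N-Triples lines describing one model response (assumed predicates)."""
    subject = f"<{EX}response/{question_id}/{condition}/{lang}>"
    return [
        f"{subject} <{EX}forQuestion> <{EX}question/{question_id}> .",
        f"{subject} <{EX}contextCondition> <{EX}{condition}> .",
        f'{subject} <{EX}answerText> "{escape_literal(answer_text)}"@{lang} .',
    ]


def build_graph(question_id, answers):
    """Collect triples for every (condition, language) pair of one question.

    `answers` maps (condition, lang) tuples to the model's answer text.
    """
    lines = []
    for condition, lang in product(CONDITIONS, LANGUAGES):
        lines.extend(
            response_triples(question_id, condition, lang, answers[(condition, lang)])
        )
    return lines


if __name__ == "__main__":
    # Placeholder answers only; no real model output is reproduced here.
    placeholder = {key: "placeholder answer" for key in product(CONDITIONS, LANGUAGES)}
    for line in build_graph("q1", placeholder):
        print(line)
```

Keeping one resource per (question, condition, language) combination means that comparisons such as the complete versus conflicting condition, or German versus English, reduce to queries over a shared predicate, which is the kind of analysis the abstract attributes to the structured representation.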

Subjects: Computation and Language, Artificial Intelligence, Information Retrieval

Publish: 2025-04-30 13:06:40 UTC