An Agentic Evaluation Architecture for Historical Bias Detection in Educational Textbooks

Authors: Gabriel Stefan, Adrian-Marius Dumitran

History textbooks often contain implicit biases, nationalist framing, and selective omissions that are difficult to audit at scale. We propose an agentic evaluation architecture comprising a multimodal screening agent, a heterogeneous jury of five evaluative agents, and a meta-agent for verdict synthesis and human escalation. A central contribution is a Source Attribution Protocol that distinguishes textbook narrative from quoted historical sources, preventing the misattribution that causes systematic false positives in single-model evaluators. In an empirical study on Romanian upper-secondary history textbooks, 83.3% of 270 screened excerpts were classified as pedagogically acceptable (mean severity 2.9/7), versus 5.4/7 under a zero-shot baseline, demonstrating that agentic deliberation mitigates over-penalization. In a blind human evaluation (18 evaluators, 54 comparisons), the Independent Deliberation configuration was preferred in 64.8% of cases over both a heuristic variant and the zero-shot baseline. At approximately $2 per textbook, these results position agentic evaluation architectures as economically viable decision-support tools for educational governance.
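As a rough illustration of the pipeline described in the abstract, the sketch below filters out quoted historical sources before scoring and then lets a meta-agent step combine five juror severities (out of 7) into one verdict. The abstract does not state how verdicts are synthesized or when cases are escalated, so the median aggregation, the spread-based escalation rule, the threshold value, and every name used here (`Segment`, `Verdict`, `narrative_only`, `synthesize`) are assumptions made for illustration only, not the authors' method.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Segment:
    text: str
    is_quoted_source: bool  # hypothetical flag: True for a quoted historical source


@dataclass
class Verdict:
    severity: float  # aggregated severity, out of 7
    escalate: bool   # True when the case should go to a human reviewer


def narrative_only(segments):
    """Keep only the textbook's own narrative, dropping quoted sources."""
    return [s for s in segments if not s.is_quoted_source]


def synthesize(jury_scores, disagreement_threshold=3):
    """Combine juror severities (assumed rule: median) and flag wide disagreement."""
    if not jury_scores:
        raise ValueError("at least one juror score is required")
    spread = max(jury_scores) - min(jury_scores)
    return Verdict(severity=median(jury_scores),
                   escalate=spread >= disagreement_threshold)


# Made-up example data, not taken from the study.
segments = [
    Segment("The treaty was signed in 1920.", is_quoted_source=False),
    Segment('"Our nation alone was wronged."', is_quoted_source=True),
]
to_score = narrative_only(segments)      # only the first segment remains
agreed = synthesize([2, 3, 3, 2, 4])     # jurors roughly agree
split = synthesize([1, 2, 6, 2, 3])      # one juror strongly dissents
```

Under these assumptions, the quoted line is never attributed to the textbook itself, and only the second set of scores would be routed to a human reviewer, since its spread exceeds the illustrative threshold.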

Subjects: Artificial Intelligence, Computation and Language, Computers and Society, Multiagent Systems

Publish: 2026-04-09 06:51:32 UTC

2604.07883
