Analysis of Automated Document Relevance Annotation for Information Retrieval in Oil and Gas Industry

2025.emnlp-industry.132@ACL

Authors: João Vitor Mariano Correia, Murilo Missano Bell, João Vitor Robiatti Amorim, Jonas Queiroz, Daniel Pedronette, Ivan Rizzo Guilherme, Felipe Lima de Oliveira

The lack of high-quality test collections hampers Information Retrieval (IR) in specialized domains. This work addresses the problem by comparing supervised classifiers against zero-shot Large Language Models (LLMs) for automated relevance annotation in the oil and gas industry, using human expert judgments as the benchmark. A supervised classifier trained on limited expert data outperforms the LLMs, achieving an F1-score that surpasses even that of a second human annotator. The study also empirically confirms that LLMs are prone to unfairly favoring technologically similar retrieval systems. While LLMs lack precision in this context, a well-engineered classifier offers an accurate and practical path to scaling evaluation datasets within a human-in-the-loop framework that empowers, rather than replaces, human expertise.
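
To make the comparison described above concrete, the following is a minimal sketch of scoring several annotators against expert relevance judgments with F1. It assumes binary relevance labels; the function name `f1_score`, the annotator names, and all label values are hypothetical toy data invented for illustration, not the paper's data, method, or results.

```python
def f1_score(gold, predicted, positive=1):
    """F1 of `predicted` labels against `gold` labels for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumption): 1 = relevant, 0 = not relevant, one entry per
# query-document pair. The expert labels serve as the benchmark.
expert = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical annotators to compare against the expert.
annotators = {
    "classifier": [1, 0, 1, 1, 0, 1, 1, 0],
    "llm_zero_shot": [1, 1, 1, 0, 1, 1, 1, 0],
    "second_human": [1, 0, 0, 1, 0, 1, 1, 0],
}

scores = {name: f1_score(expert, labels) for name, labels in annotators.items()}

# Print annotators from highest to lowest F1.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: F1 = {score:.3f}")
```

Scoring the second human annotator with the same function gives a rough human-agreement reference point against which the automated annotators can be read.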

Subject: EMNLP.2025 - Industry Track