2512.15365


ArcBERT: An LLM-based Search Engine for Exploring Integrated Multi-Omics Metadata

Authors: Gajendra Doniparthi, Shashank Balu Pandhare, Stefan Deßloch, Timo Mühlhaus

Traditional search applications within Research Data Management (RDM) ecosystems are crucial for helping users discover and explore the structured metadata of research datasets. Text search engines typically require users to submit keyword-based queries rather than natural language ones. However, Large Language Models (LLMs) trained on domain-specific content are increasingly used for specialized natural language processing (NLP) tasks. We present ArcBERT, an LLM-based system designed for integrated metadata exploration. Unlike traditional search applications, ArcBERT understands natural language queries and relies on semantic matching. Notably, ArcBERT also understands the structure and hierarchies within the metadata, enabling it to handle diverse user querying patterns effectively.

Subjects: Databases, Information Retrieval

Published: 2025-12-17 12:11:14 UTC