2606.07843


RACT: Retrieval Augmented Column-Table Learning and Prediction for Multi-Table Schema Matching

Authors: Leonard Traeger, Enas Khwaileh, Andreas Behrend, George Karabatis

Schema matching, a critical task for integrating data from diverse sources, seeks to identify correspondences between columns across different schemas. In multi-table holistic schema matching, columns with similar semantic meaning may reside in tables with different contexts due to heterogeneous schema designs, and similarity-based techniques alone are inadequate in such cases. This paper incorporates referential context into schema matching by introducing RACT learning and prediction, a self-supervised framework that probabilistically retrieves candidate tables for source columns in order to constrain the set of relevant column candidates. Experiments demonstrate that this approach outperforms similarity-based baselines on matching multi-table schemas. In subsequent matching experiments, constraining the column search space to the top-t retrieved tables improves both average matching precision and completeness by up to 70%.
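The idea of constraining column candidates to the top-t retrieved tables can be illustrated with a minimal sketch. This is not the authors' implementation: the table scores below are hard-coded placeholders standing in for the output of a learned retrieval model, the column scorer is a simple token-level Jaccard similarity chosen only for illustration, and the schema, the function names (`jaccard`, `top_t_tables`, `match_column`) and the parameter `t` are all hypothetical.

```python
from __future__ import annotations


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two identifiers (stand-in scorer)."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def top_t_tables(table_scores: dict[str, float], t: int) -> list[str]:
    """Return the t highest-scoring candidate tables."""
    return sorted(table_scores, key=table_scores.get, reverse=True)[:t]


def match_column(source_col: str,
                 target_schema: dict[str, list[str]],
                 table_scores: dict[str, float],
                 t: int) -> tuple[str, str]:
    """Pick the best target column, searching only the top-t candidate tables."""
    allowed = top_t_tables(table_scores, t)
    candidates = [(tbl, col) for tbl in allowed for col in target_schema[tbl]]
    return max(candidates, key=lambda tc: jaccard(source_col, tc[1]))


# Hypothetical target schema: "name" appears in two tables with different contexts.
target_schema = {
    "customer": ["customer_id", "name", "city"],
    "product": ["product_id", "name", "price"],
}

# Placeholder retrieval scores for a source column such as client.name
# (assumed values; a learned model would produce these).
table_scores = {"customer": 0.9, "product": 0.1}

constrained = match_column("name", target_schema, table_scores, t=1)
print(constrained)
```

Under these assumptions, name similarity alone cannot separate `customer.name` from `product.name`, since both score identically against the source column. Restricting the search to the top-t tables resolves the ambiguity before any column comparison takes place.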

Subjects: Databases, Information Retrieval, Machine Learning

Publish: 2026-06-05 21:08:40 UTC