2506.06926

Basis Transformers for Multi-Task Tabular Regression

Authors: Wei Min Loh, Jiaqi Shang, Pascal Poupart

Dealing with tabular data is challenging due to partial information, noise, and heterogeneous structure. Existing techniques often struggle to simultaneously address key aspects of tabular data such as textual information, a variable number of columns, and unseen data with no metadata beyond column names. We propose a novel architecture, basis transformers, specifically designed to tackle these challenges while respecting inherent invariances in tabular data, including hierarchical structure and the representation of numeric values. We evaluate our design on a multi-task tabular regression benchmark, achieving an improvement of 0.338 in the median R² score and the lowest standard deviation across 34 tasks from the OpenML-CTR23 benchmark. Furthermore, our model has five times fewer parameters than the best-performing baseline and surpasses pretrained large language model baselines, even when initialized from randomized weights.
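
For readers unfamiliar with the headline metric, the following is a minimal sketch of how a median R² score and its spread across tasks could be computed. It is not taken from the paper: the function names `r2_score` and `summarize` are hypothetical, the toy scores are illustrative only, and the use of the sample standard deviation is an assumption (the abstract does not say which variant is reported).

```python
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(per_task_r2):
    """Return (median, sample standard deviation) of per-task R² scores.

    Assumption: sample (n - 1) standard deviation; the source does not
    specify which variant it reports.
    """
    return statistics.median(per_task_r2), statistics.stdev(per_task_r2)


# Toy, made-up per-task scores purely for illustration (not paper results).
toy_scores = [0.2, 0.5, 0.8]
median_r2, std_r2 = summarize(toy_scores)
```

A perfect predictor gives R² = 1, and always predicting the target mean gives R² = 0, so the median over tasks summarizes typical per-task fit while the standard deviation reflects how consistent the model is across tasks.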

Subject: Machine Learning

Publish: 2025-06-07 21:29:25 UTC