Total: 1
Large language models (LLMs) have proven successful on many machine learning tasks,including those that do not involve language generation. In specific, LLMs have been shown to be effective in solving regression, where the targets are real-numbers.One common approach is to fine tune the LLM based on the log-perplexity loss and use autoregressive sampling at the inference time. Another approach relies on adding a predictive head and finetuning it with a suitable loss. Despite the success, there has not been a study on the principled ways of using decoder LLMs for regression. In this work we compare different prior works under a unified view, and introduce RAFT, regression-aware fine-tuning, a novel approach based on the Bayes-optimal decision rule. We demonstrate how RAFT improves over established baselines on several benchmarks and model families.