Symbolic Regression (SR) algorithms select expressions based on prediction performance while keeping expression lengths short to produce explainable white-box models. In this context, SR algorithms can be evaluated by measuring the extent to which the expressions they discover are Pareto-optimal, in the sense of achieving the best R-squared score for a given expression length. This evaluation is most commonly performed in relative terms: an SR algorithm is judged on whether it Pareto-dominates the other SR algorithms included in the analysis, with no indication of efficiency or of the attainable limits. In this paper, we instead explore absolute Pareto-optimal (APO) solutions, which achieve the optimal tradeoff between the multiple SR objectives, for 34 datasets in the widely used SR benchmark SRBench, by performing an exhaustive search. Additionally, we compare eight numerical optimization methods. For every dataset, we extract an APO front of expressions that can serve as a universal baseline for SR algorithms, informing researchers of the best attainable performance for selected expression sizes. The APO fronts provided serve as an important benchmark and performance limit for SR algorithms and are made publicly available at: https://github.com/kentridgeai/SRParetoFronts
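
The notion of a Pareto front over (expression length, R-squared) tradeoffs can be illustrated with a minimal sketch, assuming each candidate expression has already been summarized as a (length, R²) pair; the function below is purely illustrative and is not the paper's released code or the SRBench API.

```python
# Illustrative sketch only: extract the Pareto front of (length, R^2) pairs,
# where one candidate dominates another if it is no longer and has a higher R^2.

from typing import List, Tuple


def pareto_front(candidates: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Return the non-dominated (length, r2) pairs, sorted by length."""
    # Keep the best R^2 observed at each expression length.
    best = {}
    for length, r2 in candidates:
        if length not in best or r2 > best[length]:
            best[length] = r2

    # Walk lengths in increasing order; keep a length only if it improves
    # on the best R^2 achievable with a shorter expression.
    front = []
    best_so_far = float("-inf")
    for length in sorted(best):
        if best[length] > best_so_far:
            front.append((length, best[length]))
            best_so_far = best[length]
    return front


if __name__ == "__main__":
    # Hypothetical candidates: (expression length, R^2 on some dataset).
    cands = [(3, 0.62), (5, 0.90), (5, 0.85), (7, 0.88), (9, 0.95)]
    print(pareto_front(cands))  # [(3, 0.62), (5, 0.90), (9, 0.95)]
```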