Does $K$-fold CV based penalty perform variable selection or does it lead to $n^{1/2}$-consistency in Lasso?

Authors: Mayukh Choudhury, Debraj Das

The least absolute shrinkage and selection operator, or Lasso, introduced by Tibshirani (1996), is one of the most widely used regularization methods in regression. The properties of Lasso are known to vary widely with the choice of the penalty parameter. Recent results of Lahiri (2021) suggest that, depending on the nature of the penalty parameter, Lasso can be either variable selection consistent or $n^{1/2}$-consistent. However, practitioners generally implement Lasso by choosing the penalty parameter in a data-dependent way, the most popular choice being $K$-fold cross-validation. In this paper, we explore the variable selection consistency and $n^{1/2}$-consistency of Lasso when the penalty is chosen by $K$-fold cross-validation with $K$ fixed. We consider the fixed-dimensional heteroscedastic linear regression model and show that Lasso with a $K$-fold cross-validation based penalty is $n^{1/2}$-consistent, but not variable selection consistent. We also establish the $n^{1/2}$-consistency of the $K$-fold cross-validation based penalty as an intermediate result. Additionally, as a consequence of $n^{1/2}$-consistency, we establish the validity of the Bootstrap for approximating the distribution of the Lasso estimator based on $K$-fold cross-validation. We validate the Bootstrap approximation in finite samples through a moderate simulation study. Thus, our results essentially justify the use of $K$-fold cross-validation in practice to draw inferences based on $n^{1/2}$-scaled pivotal quantities in Lasso regression.
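
To make the object of study concrete, the following is a minimal sketch of how a $K$-fold cross-validation based penalty might be selected for Lasso. It is not the authors' procedure or simulation design. The function names (`soft_threshold`, `lasso_cd`, `kfold_cv_penalty`), the objective scaling $\sum_i (y_i - x_i^\top \beta)^2 + \lambda \sum_j |\beta_j|$, the coordinate-descent solver, the penalty grid, and the simulated heteroscedastic data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200, tol=1e-10):
    """Coordinate descent for sum_i (y_i - x_i'b)^2 + lam * sum_j |b_j|.

    The scaling of the objective (no 1/n or 1/2 factor) is an assumption.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, with beta = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam / 2.0) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_bj
            max_delta = max(max_delta, abs(delta))
        if max_delta < tol:  # stop once a full sweep changes nothing material
            break
    return beta


def kfold_cv_penalty(X, y, lambdas, K=5, seed=0):
    """Return the grid value minimising the K-fold CV squared prediction error."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::K] for k in range(K)]  # K disjoint folds covering all indices
    cv_err = []
    for lam in lambdas:
        err = 0.0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
            err += sum(
                (y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in fold
            )
        cv_err.append(err)
    best = min(range(len(lambdas)), key=cv_err.__getitem__)
    return lambdas[best], cv_err


# Illustrative heteroscedastic data: the error scale depends on the first covariate.
rng = random.Random(1)
n, true_beta = 80, [2.0, -1.0, 0.0, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
y = [
    sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0.0, 0.5 * (1.0 + abs(row[0])))
    for row in X
]

lambdas = [0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]  # hypothetical grid
lam_hat, cv_err = kfold_cv_penalty(X, y, lambdas, K=5)
beta_hat = lasso_cd(X, y, lam_hat)  # refit on the full sample at the chosen penalty
print("selected penalty:", lam_hat)
print("Lasso estimate:", beta_hat)
```

In this sketch the penalty is chosen from a finite grid, which is a simplification; the estimate `beta_hat` is the full-sample Lasso fit at the selected penalty, the kind of estimator whose $n^{1/2}$-consistency and variable selection behaviour the paper investigates.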