Sparse smooth additive regression models with shape constraints
Shape-constrained smooth additive regression models are a flexible and interpretable tool for modeling complex data while enforcing monotonicity or convexity. Variable selection is crucial in these models because both their performance and their interpretability can degrade in high-dimensional settings. We address sparsity through the best subset variable selection problem, which identifies the k most informative covariates by adding a set of constraints with binary variables to the model estimation problem. In our setting, shape-constrained models are estimated using a conic optimization approach. As a result, best subset selection leads to a mixed-integer conic program with semidefinite variables in the constraints, which is not tractable for off-the-shelf solvers. To address this challenge, we work with the perspective formulation and its continuous relaxation, which is tighter than that of big-M formulations. We evaluate the proposed method on simulated and real datasets, obtaining promising results.
Keywords: shape-constrained regression; feature selection; mathematical optimization; B-splines
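
To make the contrast between big-M and perspective formulations concrete, the following is a generic sketch of both for a group-sparse additive model. All notation is assumed for illustration and is not taken from the paper: B_j is a B-spline basis matrix for covariate j, beta_j its coefficient vector, z_j a binary selection indicator, lambda > 0 a ridge-type penalty weight, M an assumed bound on the coefficients, and s_j an auxiliary variable. The shape constraints (and the semidefinite variables they introduce) are omitted here.

```latex
% Big-M formulation (assumed notation): z_j = 0 forces \beta_j = 0 via the bound M.
\begin{align*}
\min_{\beta,\, z}\quad & \Big\| y - \sum_{j=1}^{p} B_j \beta_j \Big\|_2^2
                         + \lambda \sum_{j=1}^{p} \|\beta_j\|_2^2 \\
\text{s.t.}\quad & \|\beta_j\|_\infty \le M z_j, \quad j = 1, \dots, p, \\
                 & \sum_{j=1}^{p} z_j \le k, \quad z \in \{0,1\}^p .
\end{align*}

% Perspective formulation (assumed notation): the ridge term is replaced by s_j,
% linked to \beta_j and z_j through a rotated second-order cone constraint.
\begin{align*}
\min_{\beta,\, z,\, s}\quad & \Big\| y - \sum_{j=1}^{p} B_j \beta_j \Big\|_2^2
                              + \lambda \sum_{j=1}^{p} s_j \\
\text{s.t.}\quad & \|\beta_j\|_2^2 \le s_j z_j, \quad s_j \ge 0, \quad j = 1, \dots, p, \\
                 & \sum_{j=1}^{p} z_j \le k, \quad z \in \{0,1\}^p .
\end{align*}
```

In this sketch, z_j = 0 forces beta_j = 0 in both versions, and for z_j = 1 the optimal s_j equals the squared norm of beta_j, so the two problems agree for binary z. The continuous relaxations are obtained by replacing the binary condition with 0 <= z_j <= 1; the perspective version needs no bound M.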