Fitting Very Flexible Models: Linear Regression With Large Numbers of Parameters
Abstract
There are many uses for linear fitting; we consider here the interpolation and denoising of data, as when the goal is to fit a smooth, flexible function to a set of noisy data points. Investigators often choose a polynomial basis, or a Fourier basis, or wavelets, or something equally general. They also choose an order, or number of basis functions to fit, and (often) some kind of regularization. We discuss how this basis-function fitting is done, with ordinary least squares and extensions thereof. We emphasize that it can be valuable to choose far more parameters than data points, despite folk rules to the contrary: Suitably regularized models with enormous numbers of parameters generalize well and make good predictions for held-out data; over-fitting is not (mainly) a problem of having too many parameters. It is even possible to take the limit of infinite parameters, at which, if the basis and regularization are chosen correctly, the least-squares fit becomes the mean of a Gaussian process, or a kernel regression. We recommend cross-validation as a good empirical method for model selection (for example, setting the number of parameters and the form of the regularization), and jackknife resampling as a good empirical method for estimating the uncertainties of the predictions made by the model. We also give advice for building stable computational implementations.
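To make the abstract's claims concrete, here is a minimal sketch (in Python/NumPy; this is not the authors' code, which is linked under E-Print below) of a ridge-regularized least-squares fit with far more Fourier parameters than data points, solved in the cheap n-by-n "kernel" form and tuned by leave-one-out cross-validation. All specific values (n, p, the regularization strength lam, and the Fourier period) are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def fourier_design_matrix(t, p, period):
    """Feature matrix: a constant column plus sine/cosine pairs, truncated to p columns."""
    cols = [np.ones_like(t)]
    for k in range(1, p // 2 + 1):
        cols.append(np.cos(2.0 * np.pi * k * t / period))
        cols.append(np.sin(2.0 * np.pi * k * t / period))
    return np.stack(cols, axis=-1)[:, :p]

def ridge_fit(X, y, lam):
    """Minimize ||y - X beta||^2 + lam ||beta||^2 via the identity
    beta = X^T (X X^T + lam I)^{-1} y, which is cheap when p >> n."""
    n = len(y)
    K = X @ X.T  # n x n Gram matrix
    return X.T @ np.linalg.solve(K + lam * np.eye(n), y)

rng = np.random.default_rng(42)
n, p, lam, period = 23, 1024, 1e-3, 3.0  # p >> n; period longer than the data span
t = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(7.0 * t) + 0.1 * rng.normal(size=n)  # noisy data to interpolate/denoise

X = fourier_design_matrix(t, p, period)
beta = ridge_fit(X, y, lam)

t_fine = np.linspace(0.0, 1.0, 200)
y_fine = fourier_design_matrix(t_fine, p, period) @ beta  # smooth interpolant

# Leave-one-out cross-validation over lam, the empirical model-selection
# method the abstract recommends:
for trial_lam in [1e-6, 1e-3, 1e0]:
    resid = [y[i] - fourier_design_matrix(t[i:i + 1], p, period)
             @ ridge_fit(np.delete(X, i, axis=0), np.delete(y, i), trial_lam)
             for i in range(n)]
    print(trial_lam, np.mean(np.square(resid)))
```

The kernel form is the natural way to implement the over-parameterized regime the abstract describes: the linear algebra scales with the number of data points n rather than the number of parameters p, so p can be made very large (or, with the right basis and regularization, taken to infinity, recovering a Gaussian process mean).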
- Publication: Publications of the Astronomical Society of the Pacific
- Pub Date: September 2021
- DOI:
- arXiv: arXiv:2101.07256
- Bibcode: 2021PASP..133i3001H
- Keywords: Regression; Linear regression; Gaussian Processes regression; Physics - Data Analysis; Statistics and Probability; Astrophysics - Instrumentation and Methods for Astrophysics; Computer Science - Machine Learning
- E-Print: all code used to make the figures is available at https://github.com/davidwhogg/FlexibleLinearModels