Flexible Predictive Distributions from Varying-Thresholds Modelling
Abstract
A general class of models is proposed that is able to estimate the whole predictive distribution of a dependent variable $Y$ given a vector of explanatory variables $\boldsymbol{x}$. The models exploit the fact that the strength of explanatory variables to distinguish between low and high values of the dependent variable may vary across the thresholds that are used to define low and high. Simple linear versions of the models are generalizations of classical linear regression models but also of widely used ordinal regression models. They allow the effects of explanatory variables to be visualized in the form of parameter functions. More general models are based on efficient nonparametric approaches like random forests, which are more flexible and are strong prediction tools. A general estimation method is given that can use all the estimation tools that have been proposed for binary regression, including selection methods like the lasso or elastic net. For linearly structured models, maximum likelihood estimates are derived. The usefulness of the models is illustrated by simulations and several real data sets.
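The threshold idea in the abstract can be illustrated with a minimal sketch. It is not the paper's estimation procedure. It assumes a single covariate, a logistic link, simulated data, and a grid of thresholds taken from empirical deciles. All function names (`fit_logistic`, `fit_threshold_models`, `predictive_cdf`) are hypothetical. For each threshold, a separate binary regression is fitted to the indicator of $Y$ exceeding that threshold, and the fitted exceedance probabilities are combined into an estimated conditional distribution function. The running minimum used to keep the estimate monotone is a simple device assumed here.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(xs, zs, iters=25, ridge=1e-3):
    """Fit P(z = 1 | x) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    A small ridge penalty (an assumption of this sketch) keeps the
    2x2 Hessian well conditioned.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0
        h00 = h01 = h11 = 0.0
        for x, z in zip(xs, zs):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += z - p
            g1 += (z - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        g0 -= ridge * b0
        g1 -= ridge * b1
        h00 += ridge
        h11 += ridge
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def fit_threshold_models(xs, ys, thresholds):
    # One binary model per threshold, each with its own coefficients.
    return [fit_logistic(xs, [1 if y > th else 0 for y in ys])
            for th in thresholds]


def predictive_cdf(x, coefs):
    """Estimated P(Y <= threshold_j | x) for each threshold, in order."""
    cdf, surv_min = [], 1.0
    for b0, b1 in coefs:
        # Running minimum forces the exceedance probabilities to be
        # non-increasing in the threshold (assumed fix, see lead-in).
        surv_min = min(surv_min, sigmoid(b0 + b1 * x))
        cdf.append(1.0 - surv_min)
    return cdf


# Simulated data (illustrative only): y = 1 + 2x + standard normal noise.
random.seed(0)
n = 500
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 1.0) for x in xs]

# Thresholds at the empirical deciles of y.
ys_sorted = sorted(ys)
thresholds = [ys_sorted[n * k // 10] for k in range(1, 10)]

coefs = fit_threshold_models(xs, ys, thresholds)
cdf_low = predictive_cdf(-1.0, coefs)   # predictive distribution at x = -1
cdf_high = predictive_cdf(1.0, coefs)   # predictive distribution at x = +1
```

Plotting the fitted slopes in `coefs` against `thresholds` gives a rough analogue of the parameter functions mentioned in the abstract. Any other binary classifier could replace `fit_logistic` in this scheme.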
- Publication: arXiv e-prints
- Pub Date: March 2021
- arXiv: arXiv:2103.13324
- Bibcode: 2021arXiv210313324T
- Keywords: Statistics - Methodology