Errors-In-Variables Model Fitting for Partially Unpaired Data Utilizing Mixture Models

doi:10.48550/arXiv.2406.18154

We introduce a general framework for regression in the errors-in-variables regime that allows full flexibility in the dimensionality of the data, the type of observational error probability density, and the (nonlinear) model type, and that avoids ad hoc definitions of loss functions. Within this framework, we introduce model fitting for partially unpaired data, i.e., data in which the pairing between inputs and outputs is lost within given data groups (a semi-supervised setting). This is achieved by constructing mixture model densities that directly model the loss of pairing information and thereby allow inference. A numerical simulation study illustrates linear and nonlinear model fits, and a real-data study based on life expectancy data from the World Bank uses a multiple linear regression model. The results show that high-quality model fitting is possible with partially unpaired data, which opens up new applications in which pairing information is lost, whether accidentally or deliberately.
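To make the mixture idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a scalar linear model y = a*x + b, Gaussian noise on the outputs only (the inputs are treated as exact, which is a simplification of the errors-in-variables setting), a known noise level sigma, and equal mixture weights within each unpaired group. All function names, the toy data, and the grid search are hypothetical choices made for illustration.

```python
import math


def log_normal_pdf(y, mean, sigma):
    """Log density of a Gaussian N(mean, sigma^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * ((y - mean) / sigma) ** 2


def log_mean_exp(values):
    """Numerically stable log of the average of exp(v) over all v in values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def log_likelihood(a, b, sigma, paired, groups):
    """Log-likelihood of a linear model for partially unpaired data.

    paired: list of (x, y) tuples with known pairing.
    groups: list of (xs, ys) tuples; within a group the pairing is lost.
    Assumption (not from the paper): each y in a group is modeled by an
    equally weighted mixture over the predictions of all xs in that group.
    """
    total = 0.0
    for x, y in paired:
        total += log_normal_pdf(y, a * x + b, sigma)
    for xs, ys in groups:
        for y in ys:
            components = [log_normal_pdf(y, a * x + b, sigma) for x in xs]
            total += log_mean_exp(components)
    return total


def grid_fit(paired, groups, sigma, a_grid, b_grid):
    """Return the (a, b) grid point with the highest log-likelihood."""
    candidates = ((a, b) for a in a_grid for b in b_grid)
    return max(candidates, key=lambda p: log_likelihood(p[0], p[1], sigma, paired, groups))


# Toy data generated from y = 2*x + 1 without noise (illustrative only).
paired = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
group_xs = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
group_ys = [19.0, 17.0, 15.0, 13.0, 11.0, 9.0, 7.0]  # order scrambled: pairing lost
groups = [(group_xs, group_ys)]

a_grid = [0.5 * i for i in range(9)]        # 0.0, 0.5, ..., 4.0
b_grid = [0.5 * i - 1.0 for i in range(9)]  # -1.0, -0.5, ..., 3.0
best_a, best_b = grid_fit(paired, groups, 0.5, a_grid, b_grid)
print(best_a, best_b)
```

Because the group term sums over every output and averages over every input in the group, it does not depend on the order in which the unpaired values are listed. In this toy setup the few paired points are what resolve the sign of the slope; the unpaired group alone would fit a mirrored line equally well.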

Publication: arXiv e-prints

Pub Date: June 2024

DOI: 10.48550/arXiv.2406.18154

arXiv: arXiv:2406.18154

Bibcode: 2024arXiv240618154H

Keywords: Statistics - Methodology; Mathematics - Probability

NASA/ADS
