Shuffled linear regression through graduated convex relaxation
Abstract
The shuffled linear regression problem aims to recover linear relationships in datasets where the correspondence between input and output is unknown. This problem arises in a wide range of applications including survey data, in which one needs to decide whether the anonymity of the responses can be preserved while uncovering significant statistical connections. In this work, we propose a novel optimization algorithm for shuffled linear regression based on a posterior-maximizing objective function assuming Gaussian noise prior. We compare and contrast our approach with existing methods on synthetic and real data. We show that our approach performs competitively while achieving empirical running-time improvements. Furthermore, we demonstrate that our algorithm is able to utilize the side information in the form of seeds, which recently came to prominence in related problems.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2022
- DOI:
- 10.48550/arXiv.2209.15608
- arXiv:
- arXiv:2209.15608
- Bibcode:
- 2022arXiv220915608O
- Keywords:
-
- Statistics - Computation;
- Computer Science - Machine Learning