Inv-SENnet: Invariant Self Expression Network for clustering under biased data

doi:10.48550/arXiv.2211.06780


Subspace clustering algorithms are used to uncover the cluster structure that best explains a dataset. These methods are widely used for data-exploration tasks across the natural sciences. However, most of them fail to handle unwanted biases in datasets. For datasets in which each sample represents multiple attributes, naively applying a clustering approach can produce undesired output. To this end, we propose a novel framework for jointly removing unwanted attributes (biases) while learning to cluster data points in individual subspaces. Assuming that information about the bias is available, we regularize the clustering method by adversarially learning to minimize the mutual information between the data and the unwanted attributes. Our experimental results on synthetic and real-world datasets demonstrate the effectiveness of our approach.
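For readers unfamiliar with the self-expression idea in the title, the following is a minimal sketch of the generic self-expressive step used in subspace clustering, not the authors' network. It assumes a ridge-regularized objective with a closed-form solution; the function names, the parameter `lam`, and the toy data are hypothetical, and the adversarial mutual-information regularizer described in the abstract is deliberately left out.

```python
import numpy as np


def self_expressive_coefficients(X, lam=1e-2):
    """Express each sample (row of X) as a linear combination of all samples.

    Assumed objective: minimize ||X - C X||_F^2 + lam * ||C||_F^2.
    Setting the gradient to zero gives C (G + lam I) = G with G = X X^T.
    """
    G = X @ X.T
    n = G.shape[0]
    # (G + lam I) is symmetric, so C^T = solve(G + lam I, G).
    return np.linalg.solve(G + lam * np.eye(n), G).T


def affinity_from_coefficients(C):
    """Symmetric, non-negative affinity with self-loops removed."""
    A = np.abs(C) + np.abs(C).T
    np.fill_diagonal(A, 0.0)
    return A


# Toy data (hypothetical): two orthogonal 2-D subspaces embedded in R^6.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(5, 2)) @ np.eye(6)[0:2]  # spans axes 0 and 1
X2 = rng.normal(size=(5, 2)) @ np.eye(6)[2:4]  # spans axes 2 and 3
X = np.vstack([X1, X2])

C = self_expressive_coefficients(X)
A = affinity_from_coefficients(C)

within = A[:5, :5].sum() + A[5:, 5:].sum()  # affinity inside each subspace
cross = A[:5, 5:].sum()                     # affinity across subspaces
print(f"within-subspace affinity: {within:.4f}, cross-subspace affinity: {cross:.2e}")
```

The affinity matrix `A` would normally be passed to spectral clustering to obtain cluster labels. In a bias-aware variant such as the one the abstract describes, an additional adversarial term would discourage the learned representation from carrying information about the unwanted attribute; that term is not modeled here.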

Publication: arXiv e-prints

Pub Date: November 2022

DOI: 10.48550/arXiv.2211.06780

arXiv: arXiv:2211.06780

Bibcode: 2022arXiv221106780S

Keywords: Computer Science - Machine Learning; Computer Science - Computer Vision and Pattern Recognition

NASA/ADS
