Class Discovery in Galaxy Classification
Abstract
In recent years, automated, supervised classification techniques have been fruitfully applied to labeling and organizing large astronomical databases. These methods require off-line classifier training, based on labeled examples from each of the (known) object classes. In practice, only a small batch of labeled examples, hand-labeled by a human expert, may be available for training. Moreover, there may be no labeled examples for some classes present in the data; i.e., the database may contain several unknown classes. Unknown classes may be present because of (1) uncertainty in or lack of knowledge of the measurement process, (2) an inability to adequately "survey" a massive database to assess its content (classes), and/or (3) an incomplete scientific hypothesis. In recent work, the question of new class discovery in mixed labeled/unlabeled data was formally posed, with a proposed solution based on mixture models. In this work we investigate this approach, propose a competing technique suitable for class discovery in neural networks, and evaluate methods for both classification and class discovery in several astronomical data sets. Our results demonstrate up to a 57% reduction in classification error compared to a standard neural network classifier that uses only labeled data.
- Publication:
- The Astrophysical Journal
- Pub Date:
- January 2005
- DOI:
- 10.1086/426068
- arXiv:
- arXiv:astro-ph/0406323
- Bibcode:
- 2005ApJ...618..723B
- Keywords:
- Astronomical Data Bases: Miscellaneous;
- Galaxies: General;
- Methods: Data Analysis;
- Methods: Statistical;
- Astrophysics
- E-Print:
- Astrophys.J. 618 (2005) 723-732