Support Vector Machines and Kd-tree for Separating Quasars from Large Survey Databases
Abstract
We compare the performance of two automated classification algorithms, the k-dimensional tree (kd-tree) and support vector machines (SVMs), for separating quasars from stars in the catalogs of the Sloan Digital Sky Survey (SDSS) and the Two Micron All Sky Survey (2MASS). Both algorithms are trained on subsets of SDSS and 2MASS objects whose nature is known from spectroscopy. We train the classifiers on different attribute combinations as input patterns, using photometric data only, and present the classification results obtained by the two methods. To evaluate the two algorithms we compute precision and recall, true positive rate and true negative rate, F-measure, G-mean and Weighted Accuracy. The study shows that both kd-tree and SVMs are effective automated algorithms for classifying point sources. SVMs show slightly higher accuracy, but kd-tree requires less computation time. Across input patterns built from various parameters (e.g. magnitudes, color information), we conclude that both kd-tree and SVMs perform better with fewer features. Our results also indicate that the highest accuracy is obtained with the four colors (u-g, g-r, r-i, i-z) and the r magnitude based on SDSS model magnitudes. The classifiers trained by kd-tree and SVMs can be used to solve the automated classification problems faced by the virtual observatory (VO); moreover, both can be applied to the photometric preselection of quasar candidates for large survey projects in order to optimize the efficiency of telescopes.
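As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes them from the four confusion-matrix counts, with quasars taken as the positive class. The abstract does not give the formulas, so the definitions here are assumptions based on common usage: F-measure as the harmonic mean of precision and recall, G-mean as the geometric mean of the true positive and true negative rates, and Weighted Accuracy as `beta * TPR + (1 - beta) * TNR`. The function name, the `beta` parameter and the example counts are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp, fn, fp, tn, beta=0.5):
    """Compute binary classification metrics from confusion-matrix counts.

    tp, fn: positives (e.g. quasars) classified correctly / incorrectly
    fp, tn: negatives (e.g. stars) classified incorrectly / correctly
    beta:   assumed weight on the true positive rate in Weighted Accuracy

    Assumes every denominator below is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to the true positive rate
    tnr = tn / (tn + fp)     # true negative rate
    # Assumed definition: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    # Assumed definition: geometric mean of the two class-wise rates.
    g_mean = math.sqrt(recall * tnr)
    # Assumed definition: convex combination of the two class-wise rates.
    weighted_accuracy = beta * recall + (1 - beta) * tnr
    return {
        "precision": precision,
        "recall": recall,
        "tpr": recall,
        "tnr": tnr,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "weighted_accuracy": weighted_accuracy,
    }


# Made-up counts for illustration only, not results from the paper.
metrics = classification_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

G-mean and Weighted Accuracy combine the per-class rates, so they stay informative when one class greatly outnumbers the other, a case in which plain accuracy can be misleading.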
- Publication: arXiv e-prints
- Pub Date: February 2008
- arXiv: arXiv:0802.0537
- Bibcode: 2008arXiv0802.0537D
- Keywords: Astrophysics
- E-Print: 11 pages, 4 figures, 8 tables. Accepted for publication in MNRAS