Support Vector Machines and Kd-tree for Separating Quasars from Large Survey Databases

doi:10.48550/arXiv.0802.0537

We compare the performance of two automated classification algorithms, the k-dimensional tree (kd-tree) and support vector machines (SVMs), in separating quasars from stars in the catalogs of the Sloan Digital Sky Survey (SDSS) and the Two Micron All Sky Survey (2MASS). Both algorithms are trained on subsets of SDSS and 2MASS objects whose nature is known from spectroscopy. We choose different attribute combinations as input patterns to train the classifiers using photometric data only, and present the classification results obtained by the two methods. Performance metrics such as precision and recall, true positive rate and true negative rate, F-measure, G-mean, and weighted accuracy are computed to evaluate the two algorithms. The study shows that both kd-tree and SVMs are effective automated algorithms for classifying point sources. SVMs show slightly higher accuracy, but kd-tree requires less computation time. Given different input patterns based on various parameters (e.g. magnitudes, color information), we conclude that both kd-tree and SVMs perform better with fewer features. Our results also indicate that the highest accuracy is reached with the four colors (u-g, g-r, r-i, i-z) and the r magnitude based on SDSS model magnitudes. The classifiers trained by kd-tree and SVMs can be used to solve the automated classification problems faced by the virtual observatory (VO); moreover, both can be applied to the photometric preselection of quasar candidates for large survey projects in order to optimize the efficiency of telescopes.
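The metrics named in the abstract can all be derived from a binary confusion matrix, with quasars as the positive class and stars as the negative class. The sketch below is illustrative only: the function name `classification_metrics`, the weight parameter `beta`, and the counts in the usage note are assumptions, not taken from the paper. The weighted-accuracy form used here, beta * TPR + (1 - beta) * TNR, is one common definition, and the paper may define it differently.

```python
import math


def classification_metrics(tp, fp, fn, tn, beta=0.5):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives, true negatives.
    beta: assumed weight on the true positive rate in the weighted accuracy.
    """

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)   # fraction of predicted quasars that are quasars
    recall = ratio(tp, tp + fn)      # true positive rate (TPR)
    tnr = ratio(tn, tn + fp)         # true negative rate (TNR)
    f_measure = ratio(2 * precision * recall, precision + recall)
    g_mean = math.sqrt(recall * tnr)
    weighted_accuracy = beta * recall + (1 - beta) * tnr  # assumed definition

    return {
        "precision": precision,
        "recall": recall,
        "tnr": tnr,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "weighted_accuracy": weighted_accuracy,
    }
```

For example, `classification_metrics(tp=90, fp=10, fn=30, tn=870)` uses made-up counts for an imbalanced sample; the G-mean and weighted accuracy stay sensitive to the minority (quasar) class even when stars dominate the sample.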

Publication:

arXiv e-prints

Pub Date:

February 2008

DOI:

10.48550/arXiv.0802.0537

arXiv:

arXiv:0802.0537

Bibcode:

2008arXiv0802.0537D

Keywords:

Astrophysics

E-Print:

11 pages, 4 figures, 8 tables. Accepted for publication in MNRAS
