Computational AstroStatistics: Fast and Efficient Tools for Analysing Huge Astronomical Data Sources
Abstract
I present a review of past and present multi-disciplinary research by the Pittsburgh Computational AstroStatistics (PiCA) group, which is dedicated to developing fast and efficient statistical algorithms for analysing huge astronomical data sources. I begin with a short review of multi-resolution kd-trees, which are the building blocks for many of our algorithms, for example quick range queries and fast n-point correlation functions. I then present new results from the use of Mixture Models (Connolly et al. 2000) in density estimation of multi-color data from the Sloan Digital Sky Survey (SDSS), specifically the selection of quasars and the automated identification of X-ray sources. I also give a brief overview of the False Discovery Rate (FDR) procedure (Miller et al. 2001a) and show how it has been used in the detection of "Baryon Wiggles" in the local galaxy power spectrum and in source identification in radio data. Finally, I look forward to new research on an automated Bayes Network anomaly detector and the possible use of the Locally Linear Embedding algorithm (LLE; Roweis & Saul 2000) for spectral classification of SDSS spectra.
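As an illustration of the FDR idea mentioned in the abstract, the following is a minimal sketch that assumes the standard Benjamini-Hochberg step-up procedure for independent tests. The abstract does not specify the variant used, and the function name `benjamini_hochberg`, the default `alpha`, and the example p-values are hypothetical choices made for this sketch.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag discoveries with the Benjamini-Hochberg step-up rule.

    Assumption: this is the textbook procedure for independent tests,
    not necessarily the exact variant used by the PiCA group.
    Returns a list of booleans, True where the null hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank

    # Reject every hypothesis whose p-value ranks at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one per candidate source in an image.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_flags = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is step-up, a p-value that fails its own rank threshold is still rejected when a larger p-value further down the sorted list passes its threshold; `alpha` bounds the expected fraction of false discoveries among the rejections rather than the per-test error rate.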
- Publication: arXiv e-prints
- Pub Date: October 2001
- DOI: 10.48550/arXiv.astro-ph/0110230
- arXiv: arXiv:astro-ph/0110230
- Bibcode: 2001astro.ph.10230N
- Keywords: Astrophysics
- E-Print: Invited talk at "Statistical Challenges in Modern Astronomy III" July 18-21 2001. 9 pages