Computational AstroStatistics: Fast and Efficient Tools for Analysing Huge Astronomical Data Sources

doi:10.48550/arXiv.astro-ph/0110230

I present here a review of past and present multi-disciplinary research of the Pittsburgh Computational AstroStatistics (PiCA) group. This group is dedicated to developing fast and efficient statistical algorithms for analysing huge astronomical data sources. I begin with a short review of multi-resolutional kd-trees, which are the building blocks for many of our algorithms, for example quick range queries and fast n-point correlation functions. I will present new results from the use of Mixture Models (Connolly et al. 2000) in density estimation of multi-color data from the Sloan Digital Sky Survey (SDSS), specifically the selection of quasars and the automated identification of X-ray sources. I will also present a brief overview of the False Discovery Rate (FDR) procedure (Miller et al. 2001a) and show how it has been used in the detection of "Baryon Wiggles" in the local galaxy power spectrum and in source identification in radio data. Finally, I will look forward to new research on an automated Bayes Network anomaly detector and the possible use of the Locally Linear Embedding algorithm (LLE; Roweis & Saul 2000) for spectral classification of SDSS spectra.

Publication: arXiv e-prints
Pub Date: October 2001
DOI: 10.48550/arXiv.astro-ph/0110230
arXiv: arXiv:astro-ph/0110230
Bibcode: 2001astro.ph.10230N
Keywords: Astrophysics
E-Print: Invited talk at "Statistical Challenges in Modern Astronomy III" July 18-21 2001. 9 pages

NASA/ADS
