Mathematical Methods for Mining in Massive Data Sets
Abstract
With the advent of higher bandwidth and faster computers, distributed data sets in the petabyte range are being collected. Obtaining information quickly from such databases requires new and improved mathematical methods. Parallel computation and scaling issues are important areas of research. Techniques such as decision trees, vector-space methods, and Bayesian and neural nets have been utilized. A short description of some successful methods and the problems to which they have been applied will be presented.
- Publication:
- Astrophysics and Algorithms
- Pub Date:
- 1998
- Bibcode:
- 1998asal.confE...9K