Document Clustering with K-tree
Abstract
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Document clustering poses many large-scale problems, and K-tree scales well with large inputs due to its low complexity. It offers promising results in terms of both efficiency and quality. Document classification was completed using Support Vector Machines.
- Publication: Lecture Notes in Computer Science
- Pub Date: 2009
- DOI: 10.1007/978-3-642-03761-0_43
- arXiv: arXiv:1001.0827
- Bibcode: 2009LNCS.5631..420D
- Keywords: Computer Science - Information Retrieval; Computer Science - Artificial Intelligence; Computer Science - Data Structures and Algorithms
- E-Print: 12 pages, INEX 2008