Solving Multiclass Learning Problems via Error-Correcting Output Codes
Abstract
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form [x_i, f(x_i)]. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that, like the other methods, the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
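The core idea of the error-correcting output code technique can be sketched in a few lines. In the toy example below, the codewords and the predicted bit vector are hypothetical values chosen for illustration (they are not taken from the paper): each class is assigned a binary codeword, each bit position defines a binary subproblem learned by its own classifier, and a test point is decoded to the class whose codeword is nearest in Hamming distance. With enough separation between codewords, this decoding step can recover the right class even when some individual binary classifiers err.

```python
# Toy sketch of error-correcting output code (ECOC) decoding.
# Codewords and predicted bits are hypothetical illustrative values.

# One codeword (row) per class; each column is a binary subproblem
# that would be learned by a separate binary classifier.
CODEWORDS = {
    "class_A": (0, 0, 1, 1, 0),
    "class_B": (1, 0, 0, 1, 1),
    "class_C": (0, 1, 1, 0, 1),
}

def hamming(a, b):
    """Count bit positions where two codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(predicted_bits):
    """Assign the class whose codeword is nearest in Hamming distance.

    This nearest-codeword rule is what allows the ensemble to correct
    mistakes made by individual binary classifiers.
    """
    return min(CODEWORDS, key=lambda c: hamming(CODEWORDS[c], predicted_bits))

# Suppose the five binary classifiers output these bits for a test point.
# The last bit disagrees with class_A's codeword, yet decoding still
# recovers class_A because it remains the nearest codeword.
print(decode((0, 0, 1, 1, 1)))  # -> class_A
```

Because the minimum pairwise Hamming distance between codewords determines how many single-bit errors can be corrected, good ECOC designs choose codewords that are well separated in both rows and columns.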
Publication: arXiv e-prints
Pub Date: December 1994
DOI: 10.48550/arXiv.cs/9501101
arXiv: arXiv:cs/9501101
Bibcode: 1995cs........1101D
Keywords: Computer Science - Artificial Intelligence
E-Print: See http://www.jair.org/ for any accompanying files