A Distributed Algorithm for Training Nonlinear Kernel Machines
Abstract
This paper concerns the distributed training of nonlinear kernel machines on Map-Reduce. We show that a reformulation of the Nyström-approximation-based solution, solved using gradient-based techniques, is well suited to this setting, especially when it is necessary to work with a large number of basis points. The main advantages of this approach are: avoiding the computation of the pseudo-inverse of the kernel sub-matrix corresponding to the basis points; the simplicity and efficiency of the distributed part of the computations; and friendliness to stage-wise addition of basis points. We implement the method using an AllReduce tree on Hadoop and demonstrate its value on a few large benchmark datasets.
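To make the core idea concrete, the sketch below is a minimal single-machine illustration (not the paper's distributed implementation): a kernel machine is parameterized directly by coefficients on a set of basis points, and a regularized squared loss is minimized by gradient descent, so no pseudo-inverse of the basis kernel sub-matrix is ever formed. The RBF kernel, the choice of the first 50 points as basis points, and all hyperparameter values are illustrative assumptions, not from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared distances, then the Gaussian (RBF) kernel.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] * X[:, 1])       # a simple nonlinear target (assumed data)

Z = X[:50]                           # basis points (hypothetical choice)
K_nm = rbf_kernel(X, Z)              # n x m kernel block: k(x_i, z_j)
K_mm = rbf_kernel(Z, Z)              # m x m basis kernel sub-matrix

# Model: f(x) = sum_j beta_j k(x, z_j); objective:
#   0.5/n * ||K_nm beta - y||^2 + 0.5 * lam * beta^T K_mm beta
# Gradient descent needs only matrix-vector products with K_nm and K_mm,
# never a pseudo-inverse of K_mm.
lam = 1e-3
H = K_nm.T @ K_nm / len(y) + lam * K_mm      # Hessian of the quadratic objective
lr = 1.0 / np.linalg.eigvalsh(H).max()       # safe step size (1/Lipschitz const.)

beta = np.zeros(Z.shape[0])
losses = []
for _ in range(200):
    r = K_nm @ beta - y
    losses.append(0.5 * (r @ r) / len(y) + 0.5 * lam * beta @ K_mm @ beta)
    grad = K_nm.T @ r / len(y) + lam * K_mm @ beta
    beta -= lr * grad
```

In the distributed setting described by the abstract, the data-dependent terms (products with `K_nm`) would be computed per partition and combined via AllReduce; the sketch above only shows the serial mathematics.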
- Publication: arXiv e-prints
- Pub Date: May 2014
- DOI: 10.48550/arXiv.1405.4543
- arXiv: arXiv:1405.4543
- Bibcode: 2014arXiv1405.4543M
- Keywords: Computer Science - Machine Learning