SecondOrder Kernel Online Convex Optimization with Adaptive Sketching
Abstract
Kernel online convex optimization (KOCO) is a framework combining the expressiveness of nonparametric kernel models with the regret guarantees of online learning. Firstorder KOCO methods such as functional gradient descent require only $\mathcal{O}(t)$ time and space per iteration, and, when the only information on the losses is their convexity, achieve a minimax optimal $\mathcal{O}(\sqrt{T})$ regret. Nonetheless, many common losses in kernel problems, such as squared loss, logistic loss, and squared hinge loss posses stronger curvature that can be exploited. In this case, secondorder KOCO methods achieve $\mathcal{O}(\log(\text{Det}(\boldsymbol{K})))$ regret, which we show scales as $\mathcal{O}(d_{\text{eff}}\log T)$, where $d_{\text{eff}}$ is the effective dimension of the problem and is usually much smaller than $\mathcal{O}(\sqrt{T})$. The main drawback of secondorder methods is their much higher $\mathcal{O}(t^2)$ space and time complexity. In this paper, we introduce kernel online Newton step (KONS), a new secondorder KOCO method that also achieves $\mathcal{O}(d_{\text{eff}}\log T)$ regret. To address the computational complexity of secondorder methods, we introduce a new matrix sketching algorithm for the kernel matrix $\boldsymbol{K}_t$, and show that for a chosen parameter $\gamma \leq 1$ our SketchedKONS reduces the space and time complexity by a factor of $\gamma^2$ to $\mathcal{O}(t^2\gamma^2)$ space and time per iteration, while incurring only $1/\gamma$ times more regret.
 Publication:

arXiv eprints
 Pub Date:
 June 2017
 arXiv:
 arXiv:1706.04892
 Bibcode:
 2017arXiv170604892C
 Keywords:

 Statistics  Machine Learning;
 Computer Science  Machine Learning