Fast Kernel Learning using Sequential Minimal Optimization

Francis R. Bach, Gert R. G. Lanckriet and Michael I. Jordan

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-04-1307
February 2004

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2004/CSD-04-1307.pdf

While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results that show that our SMO-based algorithm is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.
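
For readers unfamiliar with the underlying problem, the following is a hedged sketch of the QCQP referred to above, following the notation of Lanckriet et al. (2004); the symbols used here (G_k, c, C, e) are taken from that paper rather than from this abstract. Given labels y \in \{-1,+1\}^n, base kernel matrices K_1, \dots, K_m, the matrices G_k = \mathrm{diag}(y)\, K_k\, \mathrm{diag}(y), the all-ones vector e, and a trace budget \mathrm{tr}(K) = c on the conic combination K = \sum_k \eta_k K_k with \eta_k \ge 0, the 1-norm soft-margin kernel-learning problem reduces to

    \max_{\alpha,\, t} \;\; 2\, \alpha^\top e \; - \; c\, t
    \text{s.t.} \quad t \;\ge\; \frac{1}{\mathrm{tr}(K_k)}\, \alpha^\top G_k\, \alpha, \qquad k = 1, \dots, m,
    \qquad\quad \alpha^\top y = 0, \qquad 0 \le \alpha \le C.

Eliminating t shows that the objective contains a pointwise maximum of m quadratic forms, which is convex but non-differentiable; this is the non-smoothness that blocks a direct application of SMO. Moreau-Yosida regularization addresses it by replacing a convex function f with its smoothed envelope

    f_\mu(x) \;=\; \min_z \left\{ f(z) + \frac{1}{2\mu}\, \|x - z\|^2 \right\}, \qquad \mu > 0,

which is differentiable and has the same minimizers as f, so SMO-style coordinate updates become applicable. (The exact regularized dual used in the report differs in its details; the above is only a sketch of the general technique.)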


BibTeX citation:

@techreport{Bach:CSD-04-1307,
    Author = {Bach, Francis R. and Lanckriet, Gert R. G. and Jordan, Michael I.},
    Title = {Fast Kernel Learning using Sequential Minimal Optimization},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2004},
    Month = {Feb},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2004/5372.html},
    Number = {UCB/CSD-04-1307},
    Abstract = {While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results that show that our SMO-based algorithm is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.}
}

EndNote citation:

%0 Report
%A Bach, Francis R.
%A Lanckriet, Gert R. G.
%A Jordan, Michael I.
%T Fast Kernel Learning using Sequential Minimal Optimization
%I EECS Department, University of California, Berkeley
%D 2004
%@ UCB/CSD-04-1307
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2004/5372.html
%F Bach:CSD-04-1307