Convex Tuning of the Soft Margin Parameter

Tijl De Bie, Gert R. G. Lanckriet and Nello Cristianini

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-03-1289
November 2003

http://www.eecs.berkeley.edu/Pubs/TechRpts/2003/CSD-03-1289.pdf

In order to deal with known limitations of the hard margin support vector machine (SVM) for binary classification -- such as overfitting and the fact that some data sets are not linearly separable --, a soft margin approach has been proposed in literature. The soft margin SVM allows training data to be misclassified to a certain extent, by introducing slack variables and penalizing the cost function with an error term, i.e., the 1-norm or 2-norm of the corresponding slack vector. A regularization parameter C trades off the importance of maximizing the margin versus minimizing the error. While the 2-norm soft margin algorithm itself is well understood, and a generalization bound is known, no computationally tractable method for tuning the soft margin parameter C has been proposed so far. In this report we present a convex way to optimize C for the 2-norm soft margin SVM, by maximizing this generalization bound. The resulting problem is a quadratically constrained quadratic programming (QCQP) problem, which can be solved in polynomial time O( l^3) with l the number of training samples.


BibTeX citation:

@techreport{De Bie:CSD-03-1289,
    Author = {De Bie, Tijl and Lanckriet, Gert R. G. and Cristianini, Nello},
    Title = {Convex Tuning of the Soft Margin Parameter},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2003},
    Month = {Nov},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2003/5696.html},
    Number = {UCB/CSD-03-1289},
    Abstract = {In order to deal with known limitations of the hard margin support vector machine (SVM) for binary classification -- such as overfitting and the fact that some data sets are not linearly separable --, a soft margin approach has been proposed in literature. The soft margin SVM allows training data to be misclassified to a certain extent, by introducing slack variables and penalizing the cost function with an error term, i.e., the 1-norm or 2-norm of the corresponding slack vector. A regularization parameter <i>C</i> trades off the importance of maximizing the margin versus minimizing the error. While the 2-norm soft margin algorithm itself is well understood, and a generalization bound is known, no computationally tractable method for tuning the soft margin parameter <i>C</i> has been proposed so far. In this report we present a convex way to optimize <i>C</i> for the 2-norm soft margin SVM, by maximizing this generalization bound. The resulting problem is a quadratically constrained quadratic programming (QCQP) problem, which can be solved in polynomial time <i>O</i>(<i>l</i>^3) with <i>l</i> the number of training samples.}
}

EndNote citation:

%0 Report
%A De Bie, Tijl
%A Lanckriet, Gert R. G.
%A Cristianini, Nello
%T Convex Tuning of the Soft Margin Parameter
%I EECS Department, University of California, Berkeley
%D 2003
%@ UCB/CSD-03-1289
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2003/5696.html
%F De Bie:CSD-03-1289