# Non-Sparse Regularization and Efficient Training with Multiple Kernels

### http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-21.pdf

Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability and scalability. Unfortunately, this $\ell_1$-norm MKL is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures, we generalize MKL to arbitrary norms. We provide new insights into the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary norms, such as $\ell_p$-norms with $p>1$. Empirically, we demonstrate that the interleaved optimization strategies are much faster than the commonly used wrapper approaches. An experiment on controlled artificial data sheds light on the appropriateness of sparse, non-sparse and uniformly non-sparse MKL in various scenarios. Application of $\ell_p$-norm MKL to three hard real-world problems from computational biology shows that non-sparse MKL achieves accuracies that go beyond the state-of-the-art. We conclude that our improvements finally make MKL fit for deployment in practical applications: MKL now has a good chance of improving the accuracy (over a plain sum kernel) at an affordable computational cost.
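As a rough illustration of what a "linear combination of multiple kernels" under an $\ell_p$-norm constraint looks like, the following minimal sketch builds a few base kernel matrices and mixes them with non-negative weights scaled to unit $\ell_p$-norm. It is not the report's optimization procedure: the weights are fixed by hand rather than learned, and the helper names (`rbf_kernel`, `normalize_lp`, `combine_kernels`), the choice of RBF base kernels, and the bandwidth values are assumptions made for this example.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian RBF kernel matrix for the rows of X (assumed base kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))


def normalize_lp(theta, p):
    """Scale non-negative weights so that their l_p norm equals 1."""
    theta = np.asarray(theta, dtype=float)
    if np.any(theta < 0):
        raise ValueError("kernel weights must be non-negative")
    norm = np.sum(theta ** p) ** (1.0 / p)
    if norm == 0:
        raise ValueError("at least one weight must be positive")
    return theta / norm


def combine_kernels(kernels, theta):
    """Weighted sum of kernel matrices: K = sum_m theta_m * K_m."""
    return sum(t * K for t, K in zip(theta, kernels))


# Hypothetical toy data: 5 points in 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Three base kernels with different (arbitrarily chosen) bandwidths.
kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]

# Uniform weights with unit l_2 norm: a rescaled plain sum kernel.
theta = normalize_lp([1.0, 1.0, 1.0], p=2.0)
K = combine_kernels(kernels, theta)
```

With uniform weights the combined matrix is just a rescaled sum kernel, the baseline mentioned in the abstract. A sparse mixture would instead set most entries of `theta` to zero, while a non-sparse one keeps all kernels active with differing weights.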

BibTeX citation:

@techreport{Kloft:EECS-2010-21,
Author = {Kloft, Marius and Brefeld, Ulf and Sonnenburg, Sören and Zien, Alexander},
Title = {Non-Sparse Regularization and Efficient Training with Multiple Kernels},
Institution = {EECS Department, University of California, Berkeley},
Year = {2010},
Month = {Feb},
URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-21.html},
Number = {UCB/EECS-2010-21},
Abstract = {Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel combinations to support interpretability and scalability. Unfortunately, this $\ell_1$-norm MKL is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures, we generalize MKL to arbitrary norms. We provide new insights into the connection between several existing MKL formulations and develop two efficient interleaved optimization strategies for arbitrary norms, such as $\ell_p$-norms with $p>1$. Empirically, we demonstrate that the interleaved optimization strategies are much faster than the commonly used wrapper approaches. An experiment on controlled artificial data sheds light on the appropriateness of sparse, non-sparse and uniformly non-sparse MKL in various scenarios. Application of $\ell_p$-norm MKL to three hard real-world problems from computational biology shows that non-sparse MKL achieves accuracies that go beyond the state-of-the-art. We conclude that our improvements finally make MKL fit for deployment in practical applications: MKL now has a good chance of improving the accuracy (over a plain sum kernel) at an affordable computational cost.}
}


EndNote citation:

%0 Report
%A Kloft, Marius
%A Brefeld, Ulf
%A Sonnenburg, Sören
%A Zien, Alexander
%T Non-Sparse Regularization and Efficient Training with Multiple Kernels
%I EECS Department, University of California, Berkeley
%D 2010
%8 February 24
%@ UCB/EECS-2010-21
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-21.html
%F Kloft:EECS-2010-21