# Shifting: One-Inclusion Mistake Bounds and Sample Compression

### Benjamin I. P. Rubinstein, Peter Bartlett and J. Hyam Rubinstein

###
EECS Department

University of California, Berkeley

Technical Report No. UCB/EECS-2007-86

June 25, 2007

### http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-86.pdf

We present new expected risk bounds for binary and multiclass prediction, and resolve several recent conjectures on sample compressibility due to Kuzmin and Warmuth. By exploiting the combinatorial structure of concept class
*F*, Haussler
*et al.* achieved a VC(
*F*)/
*n* bound for the natural one-inclusion prediction strategy. The key step in their proof is a
*d*=VC(
*F*) bound on the graph density of a subgraph of the hypercube—one-inclusion graph. The first main result of this report is a density bound of
*n*∙choose(
*n*-1,≤
*d*-1)/choose(
*n*,≤
*d*) <
*d*, which positively resolves a conjecture of Kuzmin and Warmuth relating to their unlabeled Peeling compression scheme and also leads to an improved one-inclusion mistake bound. The proof uses a new form of VC-invariant shifting and a group-theoretic symmetrization. Our second main result is an algebraic topological property of maximum classes of VC-dimension
*d* as being
*d*-contractible simplicial complexes, extending the well-known characterization that
*d*=1 maximum classes are trees. We negatively resolve a minimum degree conjecture of Kuzmin and Warmuth—the second part to a conjectured proof of correctness for Peeling—that every class has one-inclusion minimum degree at most its VC-dimension. Our final main result is a
*k*-class analogue of the
*d*/
*n* mistake bound, replacing the VC-dimension by the Pollard pseudo-dimension and the one-inclusion strategy by its natural hypergraph generalization. This result improves on known PAC-based expected risk bounds by a factor of
*O*(log
*n*) and is shown to be optimal up to a
*O*(log
*k*) factor. The combinatorial technique of shifting takes a central role in understanding the one-inclusion (hyper)graph and is a running theme throughout.

BibTeX citation:

@techreport{Rubinstein:EECS-2007-86, Author = {Rubinstein, Benjamin I. P. and Bartlett, Peter and Rubinstein, J. Hyam}, Title = {Shifting: One-Inclusion Mistake Bounds and Sample Compression}, Institution = {EECS Department, University of California, Berkeley}, Year = {2007}, Month = {Jun}, URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-86.html}, Number = {UCB/EECS-2007-86}, Abstract = {We present new expected risk bounds for binary and multiclass prediction, and resolve several recent conjectures on sample compressibility due to Kuzmin and Warmuth. By exploiting the combinatorial structure of concept class <i>F</i>, Haussler <i>et al.</i> achieved a VC(<i>F</i>)/<i>n</i> bound for the natural one-inclusion prediction strategy. The key step in their proof is a <i>d</i>=VC(<i>F</i>) bound on the graph density of a subgraph of the hypercube—one-inclusion graph. The first main result of this report is a density bound of <i>n</i>∙choose(<i>n</i>-1,≤<i>d</i>-1)/choose(<i>n</i>,≤<i>d</i>) < <i>d</i>, which positively resolves a conjecture of Kuzmin and Warmuth relating to their unlabeled Peeling compression scheme and also leads to an improved one-inclusion mistake bound. The proof uses a new form of VC-invariant shifting and a group-theoretic symmetrization. Our second main result is an algebraic topological property of maximum classes of VC-dimension <i>d</i> as being <i>d</i>-contractible simplicial complexes, extending the well-known characterization that <i>d</i>=1 maximum classes are trees. We negatively resolve a minimum degree conjecture of Kuzmin and Warmuth—the second part to a conjectured proof of correctness for Peeling—that every class has one-inclusion minimum degree at most its VC-dimension. Our final main result is a <i>k</i>-class analogue of the <i>d</i>/<i>n</i> mistake bound, replacing the VC-dimension by the Pollard pseudo-dimension and the one-inclusion strategy by its natural hypergraph generalization. This result improves on known PAC-based expected risk bounds by a factor of <i>O</i>(log <i>n</i>) and is shown to be optimal up to a <i>O</i>(log <i>k</i>) factor. The combinatorial technique of shifting takes a central role in understanding the one-inclusion (hyper)graph and is a running theme throughout.} }

EndNote citation:

%0 Report %A Rubinstein, Benjamin I. P. %A Bartlett, Peter %A Rubinstein, J. Hyam %T Shifting: One-Inclusion Mistake Bounds and Sample Compression %I EECS Department, University of California, Berkeley %D 2007 %8 June 25 %@ UCB/EECS-2007-86 %U http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-86.html %F Rubinstein:EECS-2007-86