Joint Support Recovery under High-Dimensional Scaling: Benefits and Perils of l1,∞-Regularization
Martin Wainwright and Sahand N. Negahban
Vodafone Graduate Fellowship and NSF Grant 10853
We consider the following instance of transfer learning: given a pair of regression problems, suppose that the regression coefficients share a partially common support, parameterized by the overlap fraction α between the two supports. This setup suggests the use of l1,∞-regularized linear regression for recovering the support sets of both regression vectors, as has been proposed in past work. Our main contribution is to provide a sharp characterization of the sample complexity of this l1,∞ relaxation, exactly pinning down the minimal sample size n required for joint support recovery as a function of the model dimension p, support size s, and overlap α ∈ [0,1]. For measurement matrices drawn from standard Gaussian ensembles, we prove that the joint l1,∞-regularized method undergoes a phase transition characterized by the order parameter θ1,∞(n,p,s,α) = n/[(4-3α)s log(p-(2-α)s)]. More precisely, the probability of successfully recovering both supports converges to 1 for scalings such that θ1,∞ > 1. An implication of this threshold is that the use of l1,∞-regularization yields gains in sample complexity when the overlap is large enough (α > 2/3), but performs worse than the naive approach of solving each problem separately when α < 2/3. We illustrate the close agreement between these theoretical predictions and the actual behavior observed in simulations. Thus, our results illustrate both the benefits and dangers associated with block-l1,∞ regularization in high-dimensional inference.
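The threshold behavior above can be made concrete with a short numerical sketch. The helper below computes the l1,∞ norm of a two-column coefficient matrix and evaluates the order parameter θ1,∞ from the abstract; as a point of comparison it also uses n = 2s log(p-s), a commonly cited sample-size threshold for running the Lasso on each problem separately, which here is an assumption introduced only for illustration (the function names are ours, not from the paper):

```python
import math

def l1_inf_norm(B):
    """l1,inf norm of a coefficient matrix B (rows = coefficients,
    columns = regression problems): sum over rows of the largest
    absolute entry in each row."""
    return sum(max(abs(x) for x in row) for row in B)

def theta(n, p, s, alpha):
    """Order parameter theta_{1,inf}(n, p, s, alpha); joint support
    recovery succeeds with high probability when theta > 1."""
    return n / ((4 - 3 * alpha) * s * math.log(p - (2 - alpha) * s))

def joint_sample_size(p, s, alpha):
    """Sample size n at which theta_{1,inf} = 1, i.e. the predicted
    threshold for the joint l1,inf-regularized method."""
    return (4 - 3 * alpha) * s * math.log(p - (2 - alpha) * s)

def naive_sample_size(p, s):
    """Assumed comparison point: n = 2 s log(p - s), the standard
    Lasso threshold applied to each problem separately."""
    return 2 * s * math.log(p - s)

if __name__ == "__main__":
    p, s = 1000, 10
    for alpha in (0.0, 2 / 3, 1.0):
        print(alpha, joint_sample_size(p, s, alpha), naive_sample_size(p, s))
```

Under these assumptions, the prefactor (4-3α) makes the crossover visible: at α = 2/3 it equals 2 and the joint threshold essentially matches the separate-Lasso one, while at α = 1 the joint method needs roughly half as many samples and at α = 0 roughly twice as many.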