Surrogate loss functions, divergences and decentralized detection

XuanLong Nguyen

University of California, Berkeley


Most of the machine learning literature on detection and classification abstracts away the underlying communication infrastructure, whose constraints may prevent an algorithm from aggregating all relevant data at a central site. In many real-life applications, however, resource limitations make it necessary to transmit only partial descriptions of the data. Sensor networks, in which each sensor operates under power or bandwidth constraints, are a prominent example.

In this talk, I shall describe an algorithmic framework for nonparametric decentralized detection from empirical data. In contrast to classical work on classification, we must learn both the quantization rules at the individual sensors and the classification rule at the central coordinator. The key ingredients of our framework are surrogate convex loss functions and marginalized kernels, which together yield a computationally efficient learning algorithm.
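As a toy illustration of the decentralized setting (my own parametric sketch, not the kernel-based nonparametric method of the talk): each sensor sends a single bit obtained by thresholding its observation, the fusion center takes a majority vote, and the quantizer is "learned" by sweeping the threshold to minimize the exact error probability.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def majority_error(t, n=5):
    """Exact error of majority-vote fusion when each of n sensors sends the
    bit 1{x > t}, with x ~ N(+1, 1) or N(-1, 1) under the two hypotheses
    (equal priors)."""
    p1 = phi(t - 1.0)     # P(bit wrong | hypothesis +1), per sensor
    p0 = phi(-t - 1.0)    # P(bit wrong | hypothesis -1), per sensor
    def tail(p):          # P(a majority of the n bits are wrong)
        return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range((n + 1) // 2, n + 1))
    return 0.5 * tail(p1) + 0.5 * tail(p0)

# "Learn" the quantization rule by sweeping the threshold over a grid.
grid = [i / 10.0 - 2.0 for i in range(41)]
best_t = min(grid, key=majority_error)
print(best_t, majority_error(best_t))  # by symmetry, the best threshold is 0
```

Even this one-parameter version shows the coupling the abstract refers to: the quality of the fusion (classification) rule depends on the quantizers, so the two must be optimized jointly.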

In the second part of the talk, I'll show that there is a class of surrogate convex losses for which our learning procedure is consistent, i.e., guaranteed to find decision rules that are optimal with respect to the 0-1 loss. This consistency stems from a precise correspondence between surrogate loss functions and a class of divergence functionals known as Ali-Silvey distances, or f-divergences. The correspondence has implications well beyond the decentralized detection setting: in particular, it motivates a nonparametric M-estimation method for estimating f-divergence functionals and the density ratio of two probability distributions from empirical data alone. I shall describe the theoretical properties of this estimator along with an empirical evaluation.
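One concrete instance of the loss/divergence correspondence that can be checked numerically: for the hinge loss with equal priors, the optimal surrogate risk equals one minus the total variation distance between the two class-conditional distributions. The sketch below (with distributions of my own choosing over three quantization cells) verifies this identity.

```python
def hinge(t):
    """Hinge surrogate loss, phi(t) = max(0, 1 - t)."""
    return max(0.0, 1.0 - t)

# Two class-conditional distributions over three quantization cells.
P = [0.5, 0.3, 0.2]
Q = [0.1, 0.3, 0.6]

# Optimal hinge risk with equal priors: choose the best decision value
# gamma in [-1, 1] separately for each cell.
grid = [i / 100.0 - 1.0 for i in range(201)]
opt_hinge_risk = sum(
    min(0.5 * p * hinge(g) + 0.5 * q * hinge(-g) for g in grid)
    for p, q in zip(P, Q))

tv = 0.5 * sum(abs(p - q) for p, q in zip(P, Q))
print(opt_hinge_risk, 1.0 - tv)  # the two values agree (0.6, up to rounding)
```

The per-cell minimum works out to min(P(z), Q(z)), so the total risk is the overlap of the two distributions, i.e., 1 - TV(P, Q); other surrogate losses correspond to other f-divergences in the same way.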
