Privacy-Preserving Support Vector Learning
Benjamin I. P. Rubinstein, Peter Bartlett, Ling Huang1 and Nina Taft2
National Science Foundation DMS-0707060 and Siebel Scholarship
Dwork et al. [1,2] proposed differential privacy as a strong guarantee for data privacy. A differentially private interactive mechanism may release a statistic provided the statistic reveals exceedingly little about individual data. Non-interactive mechanisms may release anonymized forms of the data, provided that the original data is not identifiable. Recently there has been interest in the utility of privacy-preserving non-interactive mechanisms (e.g. ): for a mechanism to be useful, statistics on the anonymized data must resemble statistics on the original data.
In this project we propose a differentially private interactive mechanism for support vector machine (SVM) learning. The mechanism releases a parametrization of a classifier trained on a privacy-sensitive dataset; a released classifier can then be used to classify subsequent test points an arbitrary number of times. We propose an analogous notion of utility for interactive mechanisms: that with high probability the mechanism's response be close to the non-private learner's response. And we prove that the proposed PrivateSVM mechanism is useful with respect to the SVM.
To achieve greater privacy, our mechanism must sacrifice some (quantified) amount of utility. This leads to the natural question of lower bounds for the privacy preserved by useful mechanisms. We derive such lower bounds for any mechanism approximating the SVM.
- Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith. Calibrating Noise to Sensitivity in Private Data Analysis. In Theory of Cryptography Conference (TCC), Springer, 2006.
- Cynthia Dwork. Differential Privacy. International Colloquium on Automata, Languages and Programming (ICALP) 2006, p. 1-12.
- Avrim Blum, Katrina Ligett, Aaron Roth. A Learning Theory Approach to Non-Iteractive Database Privacy.In Proceedings of the 40th annual ACM symposium on Theory of computing, 2008
1Intel Labs Berkeley
2Intel Labs Berkeley