Horseshoes and Dichotomies: Finding the hidden variables

Susan Holmes

Stanford University

Abstract

Classical multidimensional scaling (MDS) is a method for visualizing high-dimensional point clouds by mapping them to low-dimensional Euclidean space. This mapping is defined in terms of eigenfunctions of a matrix of interpoint dissimilarities. In this paper we analyze in detail multidimensional scaling applied to a specific dataset: the 2005 United States House of Representatives roll call votes. MDS and kernel projections output "horseshoes" that are characteristic of dimensionality reduction techniques. We show that, in general, a latent ordering of the data gives rise to these patterns when one has only local information, that is, when only the interpoint distances for nearby points are known accurately. Our analysis provides rigorous results and insight into manifold learning in the special case where the manifold is a curve, or two curves. We also raise further questions about using extra information to supplement this analysis.
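
As a rough illustration of the mechanism described in the abstract, the following sketch applies classical MDS to points with a one-dimensional latent ordering, once with exact distances and once with distances that are accurate only for nearby points. The saturating transform 1 - exp(-gap), the grid of 50 points, and the names classical_mds, t, gap and local_dissim are illustrative assumptions, not details taken from the abstract. It is a minimal sketch using NumPy, not the analysis carried out in the paper.

```python
import numpy as np


def classical_mds(dissim, k=2):
    """Classical MDS: embed n points in R^k from an n x n dissimilarity matrix.

    Returns the n x k coordinates and the k largest eigenvalues
    (negative eigenvalues are clipped to zero).
    """
    n = dissim.shape[0]
    # Double-center the squared dissimilarities to get a Gram-like matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dissim ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the k largest.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:k]
    lam = np.clip(eigvals[top], 0.0, None)
    return eigvecs[:, top] * np.sqrt(lam), lam


# Hypothetical latent ordering: 50 evenly spaced positions on a line.
t = np.linspace(0.0, 5.0, 50)
gap = np.abs(t[:, None] - t[None, :])

# Assumed "local information" model: dissimilarities track the true gap
# for nearby points but saturate towards 1 for distant points.
local_dissim = 1.0 - np.exp(-gap)

# With exact distances the embedding is essentially one-dimensional.
line_coords, line_eigvals = classical_mds(gap)

# With saturating distances a second dimension carries real weight,
# which is where the curved, horseshoe-like picture comes from.
coords, eigvals = classical_mds(local_dissim)

print("exact distances, eigenvalue ratio:", line_eigvals[1] / line_eigvals[0])
print("local distances, eigenvalue ratio:", eigvals[1] / eigvals[0])
```

Plotting coords[:, 0] against coords[:, 1] (for example with matplotlib) shows the curved arrangement, while line_coords stays on a straight line. The contrast between the two eigenvalue ratios is the point of the sketch: the extra dimension is produced by the loss of long-range distance information rather than by any second latent variable.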

This is joint work with Persi Diaconis and Sharad Goel.