# Research Projects

## Using Canonical Correlation Analysis (CCA) for Opinion Visualization

Siamak Faridani, David Wong, Ephrat Bitton and Ken Goldberg

Opinion Space is an online collaboration and discussion forum that draws on techniques from deliberative polling, collaborative filtering, and multidimensional visualization [1]. The system collects each participant's opinions on a set of statements as scalar values on a continuous scale and applies dimensionality reduction to project the data onto a two-dimensional plane for visualization and navigation. This places all participants on a level playing field: points that are far apart correspond to participants with very different opinions, and points that are close together correspond to participants with similar opinions. Participants in Opinion Space also write a textual response to a discussion question and are encouraged to earn points by reading and rating the responses of others. In this project we use Canonical Correlation Analysis (CCA), which uses both a participant's numerical data and textual response to compute that participant's position on the plane. CCA finds a pair of linear transformations, one for each of two sets of data, that maximizes the correlation between the transformed sets. We designed a formal quantitative framework to compare the quality of the projections produced by Principal Component Analysis (PCA) and CCA. We found that CCA yielded the highest correlation between spatial distance and the agreement ratings of comments. Numerical results on the current dataset suggest that CCA is a more effective dimensionality reduction method for Opinion Space than PCA.
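To make the CCA step concrete, here is a minimal NumPy sketch of classical linear CCA applied to two views of the same participants. It is an illustration under stated assumptions, not the Opinion Space implementation. The data are synthetic, and the feature counts (5 numerical ratings, 8 text-derived features), the `cca` helper, and the small ridge term `reg` are all hypothetical choices. The project description does not specify how the textual responses are turned into feature vectors.

```python
import numpy as np


def cca(X, Y, n_components=2, reg=1e-8):
    """Classical linear CCA via whitening and SVD (illustrative sketch).

    X: (n, p) array, e.g. numerical opinion ratings.
    Y: (n, q) array, e.g. features derived from text responses.
    Returns projection matrices A (p, k), B (q, k) and the k canonical
    correlations. `reg` is a small ridge term added for numerical
    stability; it is an assumption of this sketch.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariance and cross-covariance matrices.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each view with its Cholesky factor: Sxx = Lx Lx^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical
    # correlations.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original coordinates.
    A = np.linalg.solve(Lx.T, U[:, :n_components])
    B = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return A, B, s[:n_components]


# Synthetic stand-in data: 200 participants whose ratings and text
# features both depend on a shared 2-D latent "opinion" plus noise.
rng = np.random.default_rng(0)
n_participants = 200
latent = rng.normal(size=(n_participants, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.5 * rng.normal(size=(n_participants, 5))
Y = latent @ rng.normal(size=(2, 8)) + 0.5 * rng.normal(size=(n_participants, 8))

A, B, corrs = cca(X, Y, n_components=2)

# One possible 2-D position per participant: project the centered
# numerical view onto the first two canonical directions.
positions = (X - X.mean(axis=0)) @ A

print("canonical correlations:", corrs)
print("positions shape:", positions.shape)
```

In this sketch the plane coordinates come from the numerical view only, with the text view shaping which directions are chosen. Averaging the two projected views would be another reasonable option, and the description above does not say which the project used.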

Figure 1: Multidimensional visualization of Opinion Space.

Figure 2 (Table 1): Canonical Correlation Analysis yields the highest correlation between spatial distance and the agreement ratings of comments.

- [1] S. Faridani, E. Bitton, K. Ryokai, and K. Goldberg. Opinion Space: a scalable tool for browsing online comments. In Proceedings of the 28th International Conference on Human Factors in Computing Systems, pages 1175–1184. ACM, 2010.