## CS 294: Social and Information Networks - Theory and Practice

## Overview

The proliferation and growing importance of networked data and systems in our lives, as well as in various scientific disciplines, has sparked a flurry of research on tools for modeling and analyzing such data, whether in social, technological, or biological settings. This course is based on recent research on the structure, evolution, and information dynamics of such networks. Topics include probabilistic models for network formation, spectral algorithms, models and algorithms for viral propagation, and applications of sampling and sketching techniques. Prerequisites are a basic knowledge of probability and an introductory knowledge of algorithms, graphs, and linear algebra. Course work will involve two reaction papers, two to three programming assignments with provided data, and a final project.

## Course Work

- Reaction Papers - Your project will naturally fall into a major topic discussed in this course. We'd like you to pick 2 additional topics and study each a bit more in depth. For each, please write a 2-5 page reaction paper about the direction the work has taken so far and what the interesting open questions are.
- Final Project - The project is fairly open ended. You may work in groups of up to three students. We expect initial proposals by the end of the third week of classes.
- Experimental Assignments - One of the great things about social network research is the existence of real datasets. Each of the experimental assignments will focus on evaluating the theoretical results and models studied in class on the data. The programming will not be difficult.

## Topics

#### Background

- Vannevar Bush. As we may think. July 1945, Atlantic magazine.
- Chapters 1-2 and 13 of Networks, Crowds, and Markets by Easley and Kleinberg.
- J. Kleinberg. The convergence of social and technological networks. CACM, 2008.
- D. Lazer et al. Computational Social Science. Science 2009.

#### Small-world networks: experiments and models

- Main readings:
- Jon Kleinberg survey on small world models.
- Chapter 20 of Easley-Kleinberg.

- Supplementary material
- Jure Leskovec, Eric Horvitz, Planetary-Scale Views on an Instant-Messaging Network.
- P. Killworth and H. Bernard, Reverse small world experiment. Social Networks 1(1978).
- J. Kleinfeld. Could it be a Big World After All? The 'Six Degrees of Separation' Myth. Society, April 2002.
- Peter Sheridan Dodds, Roby Muhamad, Duncan J. Watts. An Experimental Study of Search in Global Social Networks. Science 301(2003), 827.
- Oskar Sandberg and Ian Clarke. The Evolution of Navigable Small-World Networks. arxiv cs.DS/0607025, July 2006.

#### Small-world networks in P2P networks and decentralized search

- Main readings
- H. Balakrishnan, M.F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Looking up data in P2P systems. Communications of the ACM 46:43-48, February 2003.
- E-K Lua, J. Crowcroft, M. Pias, R. Sharma and S. Lim. A Survey and Comparison of Peer-to-Peer Overlay Network Schemes. IEEE Communications Surveys and Tutorials, 7(2005).

- Supplementary material:
- S. Lattanzi, D. Sivakumar. Milgram routing in social networks. WWW 2012.
- D. Liben-Nowell, J. Novak, R. Kumar, P. Raghavan, A. Tomkins. Geographic routing in social networks. Proc. Natl. Acad. Sci. USA, 102(Aug 2005).
- A. Goel, H. Zhang, and R. Govindan. Using the Small World Model to Improve Freenet Performance. Proc. IEEE Infocom (2002).
- D. Eppstein, M. T. Goodrich, M. Loffler, D. Strash, L. Trott. Category-Based Routing in Social Networks: Membership Dimension and the Small-World Phenomenon

#### Models of networks

Random models for networks; preferential attachment, copying models; how to fit models to data.

- Main readings
- A. Bonato. A survey of models of the web graph.
- R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal. Stochastic models for the Web graph. FOCS 2000.

- Supplementary
- R. Kumar, J. Novak, A. Tomkins. Structure and evolution of online social networks. KDD 2006.
- W. Aiello, F. Chung, L. Lu. Random Evolution in Massive Graphs.
- J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani. Kronecker graphs: An approach to modeling networks, 2009.
- J. Leskovec, J. Kleinberg, C. Faloutsos. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. Proc. 11th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2005.
- I. Bezakova, A. Kalai, and R. Santhanam. Graph model selection using maximum likelihood. In Proceedings of the 23rd International Conference on Machine Learning, volume 148 of ACM International Conference Proceeding Series, pages 105-112. ACM Press, New York, 2006.
- Kevin J. Lang: Information Theoretic Comparison of Stochastic Graph Models: Some Experiments. WAW 2009.
- D. R. Hunter, S. M. Goodreau, and M. S. Handcock. Goodness of Fit of social network models. Journal of the American Statistical Association, 103(481):248-258, 2008.
- S. Lattanzi, D. Sivakumar. Affiliation Networks. STOC 2009.
- J. Leskovec, L. Backstrom, R. Kumar, A. Tomkins. Microscopic evolution of social networks. KDD 2008.
- Z. Bar-Yossef, A.Z. Broder, R. Kumar, A. Tomkins. Sic Transit Gloria Telae: Towards an understanding of the web's decay. WWW 2004.

#### Long tails

Issues in fitting power-laws; models for heavy tails; power-law vs. lognormal distributions.

- Main readings
- Chapter 18 of Easley-Kleinberg.
- M. Mitzenmacher. A Brief History of Generative Models for Power Law and Lognormal Distributions. Internet Mathematics, vol 1, No. 2, pp. 226-251, 2004.

- Supplementary
- M. Faloutsos, P. Faloutsos, C. Faloutsos. On Power-Law Relationships of the Internet Topology. ACM SIGCOMM 1999.
- A. Clauset, C. Shalizi and M.E.J. Newman. Power-Law Distributions in Empirical Data. SIAM Review 51(4), 661-703 (2009). (arXiv:0706.1062, doi:10.1137/070710111)
- Aaron Clauset's resource page on heavy tails.
- D. Achlioptas, A. Clauset, D. Kempe, C. Moore. On the bias of traceroute sampling: or, power-law degree distributions in regular graphs. STOC 2005.
- W. Willinger, D. Alderson, and J. C. Doyle. Mathematics and the internet: A source of enormous confusion and great potential. Notices of the American Mathematical Society, 56(5):586-599, 2009.
- S. Goel, A. Broder, E. Gabrilovich and B Pang. The Anatomy of the Long Tail, WSDM 2011.
- M. Mitzenmacher. The Future of Power Law Research.
- A. Fabrikant, E. Koutsoupias, C. Papadimitriou. Heuristically Optimized Trade-offs: A New Paradigm for Power Laws in the Internet. 29th International Colloquium on Automata, Languages, and Programming (ICALP), 2002.
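
To give a flavor of the fitting issues listed above, here is a minimal sketch of the standard maximum-likelihood estimate of a continuous power-law exponent, alpha = 1 + n / sum(ln(x_i / x_min)), the estimator discussed in the Clauset, Shalizi and Newman paper. The function names and the synthetic sampler are illustrative assumptions, not course-provided code, and the sketch assumes `x_min` is already known, which is itself one of the harder parts of fitting real data.

```python
import math
import random


def powerlaw_alpha_mle(samples, x_min):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x_i / x_min)).

    Only samples at or above x_min are used; x_min is assumed known.
    """
    tail = [x for x in samples if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("all tail samples equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, x_min, n, rng):
    """Draw n values from p(x) ~ x^(-alpha), x >= x_min, by inverse transform."""
    # rng.random() lies in [0, 1), so (1 - u) lies in (0, 1] and is never zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_powerlaw(alpha=2.5, x_min=1.0, n=20000, rng=rng)
    print(powerlaw_alpha_mle(data, x_min=1.0))
```

A least-squares line through a log-log histogram is the tempting alternative; the readings above discuss why that approach tends to give biased exponents.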

#### Clustering Methods on Networks

Random walk; local random walks; spectral algorithms on networks; modularity; problems with modularity definitions; analysis of community structure.

- Main readings
- S. Fortunato. Community Detection in Graphs.
- Dan Spielman. Chapter on spectral graph theory.
- L. Lovasz. Random Walks on Graphs: A Survey. Combinatorics: Paul Erdos is Eighty (vol. 2), 1996, pp. 353-398.
- M.E.J. Newman. Modularity and community structure in networks. Proc. Natl. Acad. Sci., 2006.

- Spectral
- Purnamrita Sarkar and Andrew W. Moore. Random Walks in Social Networks and their Applications: A Survey.
- R. Andersen, F. Chung, K. Lang. Local graph partitioning using pagerank vectors. In Proc. FOCS, 2006.
- J. Leskovec, K. Lang, M. Mahoney. Empirical Comparison of Algorithms for Network Community Detection. In Proc. WWW, 2010.
- J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics, 2009.
- S. Fortunato, M. Barthelemy. Resolution limit in community detection. Proc. Natl. Acad. Sci., 2007.

- Modularity, Trawling
- R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Trawling the web for emerging cyber-communities. In Proc. WWW, 1999

- Overlapping Communities
- S. Arora, R. Ge, S. Sachdeva, G. Schoenebeck, Finding Overlapping Communities in Social Networks: Toward a Rigorous Approach
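
As a small illustration of the modularity measure named in this topic, the sketch below computes Newman's modularity Q = sum over communities c of [L_c / m - (d_c / 2m)^2] for an undirected simple graph, where L_c is the number of edges inside c, d_c is the total degree of c, and m is the number of edges. The function name and input layout (an edge list plus a node-to-community dict) are assumptions made for the example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra_edges = defaultdict(int)  # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)   # d_c: sum of degrees of nodes in c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra_edges[cu] += 1
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, split))  # 5/14 for this partition
```

Placing every node in a single community gives Q = 0, which is one way to see that modularity scores a partition relative to a degree-preserving random baseline rather than by raw edge counts.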

#### Cascading behavior and viral propagation

Diffusion models; influence propagation & contagion; how to identify influence based diffusion vs. homophily effects.

- Main readings
- J. Kleinberg. Cascading behavior on networks. Book chapter.
- Chapter 19 and Chapter 21 of Easley-Kleinberg.

- Diffusion of information
- Eytan Bakshy, Itamar Rosenn, Cameron Marlow, Lada A. Adamic. The Role of Social Networks in Information Diffusion, WWW'12.
- A. Montanari, A. Saberi, The Spread of Innovations in Social Networks, in PNAS.
- M. Granovetter. Threshold models of collective behavior. American Journal of Sociology 83(6):1420-1443, 1978.

- Spread of influence; finding influential nodes.
- D. Kempe, J. Kleinberg, E. Tardos. Maximizing the Spread of Influence through a Social Network. Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
- J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, N. Glance. Cost-effective Outbreak Detection in Networks. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM KDD), 2007.

- Epidemic algorithms in networks
- Noam Berger, Christian Borgs, Jennifer T. Chayes, Amin Saberi: On the spread of viruses on the internet. SODA 2005:301-310
- C. Borgs, J. Chayes, Ganesh, and Amin Saberi. How to Distribute Antidotes to Control Epidemics. Random Structures & Algorithms 37(2), September 2010, Pages 204-222.

#### Computation on Large Graph Structures

Streaming, semi-streaming models for graphs; graph-processing in the Bulk Synchronous Model.

- Main readings
- A. McGregor. Graph Mining on Streams.

- Supplementary
- On Graph problems in a semi-streaming model
- A. Das Sarma, R. J. Lipton, and D. Nanongkai. Best order streaming Model.
- Jin Ahn, Guha, and McGregor. Analyzing graph structure via linear measurements. In SODA, 2012.
- Isabelle Stanton and Gabriel Kliot. Streaming Graph Partitioning for Large Distributed Graphs.
- Goel, Kapralov, Kapralova, Khanna, Single pass graph sparsification in distributed stream processing systems.
- Jin Ahn, Guha, McGregor, Graph Sketches: Sparsification, Spanners, and Subgraphs.

#### Sampling and surveying networks

Sampling networks to collect structural information; surveying social networks about subpopulations; respondent driven sampling -- using the network to collect data; conducting bucket tests for network effects.

- Main readings
- Supplementary
- J. Kleinberg, L. Backstrom. Network Bucket Testing, WWW 2011.
- J. Leskovec and C. Faloutsos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 631-636. ACM Press, New York, 2006.
- M. P. H. Stumpf, C. Wiuf, and R. M. May. Subnets of scale-free networks are not scale-free: Sampling properties of networks. Proceedings of the National Academy of Sciences, 102(12):4221-4224, 2005.
- L. Katzir, E. Liberty, O. Somekh. Framework and Algorithms for Network Bucket Testing. WWW 2012.
- M. J. Salganik and D. D. Heckathorn. Sampling and estimation in hidden populations using respondent-driven sampling. Sociological Methodology, 34:193-239, 2004.

#### Cooperation and collaboration on networks

Query incentive networks; DARPA challenge; behavioral experiments on networks.

- Main readings
- J. Kleinberg, P. Raghavan. Query Incentive networks. FOCS 2005.
- Galen Pickard et al. Time Critical Social Mobilization: The DARPA Network Challenge Winning Strategy, arXiv:1008.3172.

- Supplementary
- M. Babaioff, S. Dobzinski, S. Oren, and A. Zohar. On Bitcoin and Red Balloons, EC 2012.
- E. Arcaute, A. Kirsch, R. Kumar, D. Liben-Nowell and S. Vassilvitskii. Threshold behavior in Query incentive networks.
- A. Anagnostopoulos, L. Becchetti, C. Castillo, A. Gionis, S. Leonardi. Online Team Formation in Social Networks, WWW 2012.

#### Compressibility

- Algorithms that work on compressed graphs
- Models
- On Compressing Social Networks, by F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan.