CS 298-2
Theory Seminar

Nir Ailon
Princeton University

Aggregating Inconsistent Information: Ranking and Clustering

Monday, August 29, 2005
4pm-5pm
The Wozniak Lounge on the fourth floor of Soda Hall


A ranking of n web pages is to be chosen from outputs of k search
engines. How do we choose one ranking minimizing the "disagreement"
with the k rankings?

A clustering of n genes is to be chosen from outputs of k clustering
algorithms. How do we choose one clustering minimizing the
"disagreement" with the k clusterings?

These information aggregation problems date back to as early as 1785,
when the French philosopher Condorcet considered voting systems where
each voter chooses a full ranking of a set of candidates. Recently,
their computational aspects have been studied.

In this talk, I will describe new algorithms improving on the best known
approximation ratios for both the ranking and the clustering problems
with respect to a standard measure of disagreement. I will also discuss
related graph theoretic optimization problems, known as the minimum
feedback arc set in tournaments and the correlation clustering in
complete graphs. Additionally, I will show that the problem of finding
a minimum feedback arc set in tournaments has no poly-time algorithm,
assuming NP is not contained in BPP. This almost settles a
long-standing conjecture of Bang-Jensen and Thomassen, that the problem
is NP-hard.

This is joint work with Moses Charikar and Alantha Newman.