This challenge bridges the gap between data mining/machine learning and causal discovery. Several datasets drawn from real data, or emulating real data, are provided, with the goal of making predictions under "manipulations". The setting is very similar to the usual machine learning setting: we have a training set and a test set, and a target variable, whose values are concealed in the test data, must be predicted. However, the test data are not distributed like the training data: some variables in the test data are "manipulated" by an external agent, i.e., set to given values instead of being drawn from the "natural" distribution.
Such problems are encountered in many application domains: in medicine, to predict the effect of a new treatment; in economics or ecology, to predict the consequences of newly issued policies; in marketing, to predict customer response to marketing campaigns. Feature selection researchers should be particularly interested in this challenge. The problems it poses require finding subsets of predictive variables, taking into account whether such variables remain predictive when manipulations are performed. We anticipate that this will require knowledge of the causal relationships between variables, since acting on causes of the target may change the response, while acting on its consequences should not.
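The difference between causes and consequences under manipulation can be illustrated with a small simulation. The sketch below is not challenge data or an official baseline: it assumes a toy linear-Gaussian chain (cause -> target -> effect), and all names (`simulate`, `fit_slope`, `mse`, `evaluate`) and coefficients are hypothetical choices made for illustration. A one-variable predictor is fitted on "natural" training data and then scored on a natural test set and on a test set in which the effect variable has been set by an external agent.

```python
import random


def simulate(n, manipulate_effect=False, seed=0):
    """Draw n samples from an assumed toy chain: cause -> target -> effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        cause = rng.gauss(0, 1)
        target = 2.0 * cause + rng.gauss(0, 0.5)
        if manipulate_effect:
            # External agent sets the effect, ignoring the target.
            effect = rng.gauss(0, 1)
        else:
            # Natural distribution: the effect is a noisy copy of the target.
            effect = target + rng.gauss(0, 0.1)
        rows.append((cause, effect, target))
    return rows


def fit_slope(xs, ys):
    """Least-squares slope for a one-variable model through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mse(slope, xs, ys):
    """Mean squared error of the prediction slope * x."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def evaluate(n=5000):
    """Fit on natural data, score on natural and manipulated test sets."""
    train = simulate(n, seed=1)
    tests = {
        "natural": simulate(n, seed=2),
        "manipulated": simulate(n, manipulate_effect=True, seed=3),
    }
    results = {}
    for name, col in (("cause", 0), ("effect", 1)):
        slope = fit_slope([r[col] for r in train], [r[2] for r in train])
        results[name] = {
            label: mse(slope, [r[col] for r in rows], [r[2] for r in rows])
            for label, rows in tests.items()
        }
    return results


if __name__ == "__main__":
    for feature, scores in evaluate().items():
        print(feature, scores)
```

In this toy model the effect is the better predictor on natural test data, but its error grows sharply once it is manipulated, while the predictor built on the cause performs about the same in both settings.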
Deadline: April 30, 2008. http://www.causality.inf.ethz.ch/challenge.php