Sanjay Chawla, Federico Girosi, Fei Wang, “Data Science and the Policy Completion Problem”

The link between policy analysis and data science is more delicate than it may appear. A new policy, by definition, will change the underlying data generating model, rendering classification or supervised learning inapplicable. Perhaps eliciting causal relations from observational data is the correct framework for estimating policy impact. However, there are substantial gaps between the theory, practice and feasibility of causal models. In this paper we argue that transduction, a form of inference where we reason from specific training instances to specific test instances, may provide an appropriate framework for evidence-based policy analysis. In particular, we will demonstrate that the matrix completion problem, introduced in the data science community for making predictions in recommendation systems, can be a powerful tool for both predicting and evaluating the impact of new policy changes.

Author(s): Sanjay Chawla, Federico Girosi, Fei Wang
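
To make the matrix completion idea in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a hypothetical setup in which rows are units (for example, regions), columns are policy-outcome combinations, and unobserved cells are outcomes under policies a unit has not yet experienced. The function names `complete_matrix` and `demo`, the synthetic data, and the choice of a rank-truncated SVD iteration (often called hard-impute) are all illustrative assumptions.

```python
import numpy as np


def complete_matrix(observed, mask, rank, n_iters=500):
    """Fill unobserved cells by iterating a rank-truncated SVD (hard-impute).

    observed: 2-D array; values where mask is False are ignored.
    mask:     boolean array, True where a cell was actually observed.
    rank:     assumed rank of the underlying matrix (a modelling assumption).
    """
    # Start from the observed values, with zeros in the unobserved cells.
    filled = np.where(mask, observed, 0.0)
    for _ in range(n_iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Best rank-`rank` approximation of the current filled matrix.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        # Keep observed cells fixed; overwrite only the unobserved ones.
        filled = np.where(mask, observed, low_rank)
    return filled


def demo(seed=0):
    """Synthetic check: hide about 30% of a rank-2 matrix and try to recover it."""
    rng = np.random.default_rng(seed)
    truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
    mask = rng.random(truth.shape) < 0.7  # True = observed cell
    estimate = complete_matrix(truth, mask, rank=2)
    hidden = ~mask
    # Relative error measured on the hidden cells only.
    rel_err = np.linalg.norm((estimate - truth)[hidden]) / np.linalg.norm(truth[hidden])
    return estimate, truth, mask, rel_err


if __name__ == "__main__":
    _, _, _, rel_err = demo()
    print(f"relative error on hidden cells: {rel_err:.4f}")
```

The inference here is transductive in the sense the abstract describes: the sketch predicts only the specific missing cells of this one matrix and does not fit a general model for unseen units. The low-rank assumption does the work, so the result is only as credible as that assumption is for the data at hand.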