Christophe Gaillac: Partially Linear Models under Data Combination (with Xavier D'Haultfoeuille and Arnaud Maurel)

We study identification of and inference on a partially linear model in a data combination environment, where researchers have access to two different datasets that cannot be matched. This problem arises frequently in many fields of economics, including consumption, education and health. In situations where all of the regressors are available in one of the datasets, which includes two-sample 2SLS as a special case, we use recent results from optimal transport to derive a constructive characterization of the sharp identified set. We build on this result to develop a tractable inference method that is shown to perform well in finite samples. In situations where the regressors are not jointly observed, we propose a tractable characterization of an outer identified set. We show that this set can be quite informative in practice, that it coincides with the sharp identified set when the distributions are normal, and that it is amenable to inference using existing tools from the moment inequality literature. Finally, we apply our methodology to study multigenerational income mobility using data from the PSID.
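For readers unfamiliar with the two-sample 2SLS special case mentioned above, the following is a minimal sketch of the textbook estimator on simulated data. It is not the authors' method: it only illustrates the setting in which the outcome and the instruments are observed in one sample, while the regressors and the instruments are observed in another, unmatched sample. The function name `ts2sls`, the helper `draw`, and the data-generating process are all assumptions made for illustration.

```python
import numpy as np

def ts2sls(y1, z1, x2, z2):
    """Textbook two-sample 2SLS (illustrative sketch).

    Sample 1 contains the outcome y1 and instruments z1.
    Sample 2 contains the regressors x2 and instruments z2.
    The two samples cannot be matched at the individual level.
    """
    # First stage, estimated on sample 2: regress x on z.
    pi_hat, *_ = np.linalg.lstsq(z2, x2, rcond=None)
    # Impute the regressors in sample 1 from its instruments.
    x1_hat = z1 @ pi_hat
    # Second stage, on sample 1: regress y on the imputed regressors.
    beta_hat, *_ = np.linalg.lstsq(x1_hat, y1, rcond=None)
    return beta_hat

# Hypothetical data-generating process, used only to exercise the sketch.
rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0])  # intercept and slope

def draw(n):
    # Instruments: a constant and two independent normal variables.
    z = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
    u = rng.normal(size=n)  # structural error
    # The non-constant regressor is endogenous: it depends on u.
    x_endog = z[:, 1] + 0.5 * z[:, 2] + 0.5 * u + rng.normal(size=n)
    x = np.column_stack([np.ones(n), x_endog])
    y = x @ beta + u
    return y, x, z

n = 5000
y1, _, z1 = draw(n)   # sample 1: x is discarded, as if unobserved
_, x2, z2 = draw(n)   # sample 2: y is discarded, as if unobserved
beta_hat = ts2sls(y1, z1, x2, z2)
print(beta_hat)
```

With a strong first stage and independent samples of this size, the estimate should land close to the simulated coefficients. The sketch returns a point estimate only; it does not compute the identified sets or the inference procedures described in the abstract.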