Policy Learning Under Ambiguity

This paper studies the problem of estimating individualized treatment rules when treatment effects are partially identified, as it is often the case with observational data. We first study the population problem of assigning treatment under partial identification and derive the population optimal policies using classic optimality criteria for decision under ambiguity. We then propose an algorithm for computation of the estimated optimal treatment policy and provide statistical guarantees for its convergence to the population counterpart. Our estimation procedure leverages recent advances in the orthogonal machine learning literature, while our theoretical results account for the presence of non-differentiabilities in the problem. The proposed methods are illustrated using data from the Job Partnership Training Act study.