Predicting and Mapping MPI using Geospatial and Combined Disparate Data Sources

This seminar is organised jointly with the Institute for International Economic Policy at George Washington University and the UNDP Human Development Report Office. This seminar will be held online.

Combining disparate data sources for improved poverty prediction and mapping (Neeti Pokhriyal and Damien Christophe Jacques)
Poverty maps at the finest spatial resolution are essential for improved diagnosis and policy planning, especially in view of the Sustainable Development Goals (SDGs). “Big Data” sources such as call data records and satellite imagery have shown promise in providing intercensal statistics. However, most current studies are limited to a single data source. In this talk, we present a computational framework that efficiently combines disparate data sources, such as environmental and mobile data, to provide more accurate predictions of multidimensional poverty for the finest spatial microregions in Senegal, the case study country for this work. Our estimates are validated against concurrent census data. We will also present results of this approach for estimating poverty in Haiti. Our approach can be used to estimate poverty more frequently and at high resolution, and it has the potential to make SDG monitoring more timely and reliable. Lastly, we will discuss the biases and limitations of using non-traditional data sources for estimating poverty, as well as the future promise of such approaches.

Predicting Poverty Using Geospatial Data in Thailand (Nattapong Puttanapong et al.)
Poverty statistics are conventionally compiled using data from household income and expenditure surveys or living standards surveys. This study examines an alternative approach to estimating poverty by investigating whether readily available geospatial data can accurately predict the spatial distribution of poverty in Thailand. In particular, the geospatial data examined include night light intensity, land cover, vegetation index, land surface temperature, built-up areas, and points of interest. The study also compares the predictive performance of various econometric and machine learning methods, such as generalized least squares, neural networks, random forests, and support vector regression. Results suggest that the intensity of night lights and other variables that approximate population density are highly associated with the proportion of an area’s population living in poverty. The random forest technique yielded the highest prediction accuracy among the methods considered, perhaps due to its capability to fit complex association structures even with small and medium-sized datasets. Moving forward, additional studies are needed to investigate whether the relationships observed here remain stable over time and may therefore be used to approximate the prevalence of poverty for years when household surveys on income and expenditures are not conducted but data on geospatial correlates of poverty are available.
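
For readers unfamiliar with this kind of model comparison, the following is a minimal sketch of how several regressors might be scored with cross-validation. It is not the authors' code or data. Everything in it is an assumption made for illustration: the features and poverty rates are synthetic, the function name compare_models is hypothetical, ordinary least squares stands in for generalized least squares, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR


def compare_models(n_areas=400, seed=0):
    """Score three regressors on SYNTHETIC area-level data (illustration only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical geospatial covariates, each scaled to [0, 1].
    night_lights = rng.uniform(0.0, 1.0, n_areas)
    vegetation_index = rng.uniform(0.0, 1.0, n_areas)
    land_surface_temp = rng.uniform(0.0, 1.0, n_areas)
    X = np.column_stack([night_lights, vegetation_index, land_surface_temp])

    # Assumed relationship: poverty falls nonlinearly as night lights increase.
    poverty_rate = (
        0.6 * np.exp(-4.0 * night_lights)
        + 0.2 * vegetation_index
        + rng.normal(0.0, 0.03, n_areas)
    )
    poverty_rate = np.clip(poverty_rate, 0.0, 1.0)

    models = {
        # Ordinary least squares, used here as a simple stand-in for GLS.
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=seed),
        "svr": SVR(kernel="rbf", epsilon=0.01),
    }

    # Mean R^2 over 5 folds for each model.
    return {
        name: float(
            cross_val_score(model, X, poverty_rate, cv=5, scoring="r2").mean()
        )
        for name, model in models.items()
    }


scores = compare_models()
for name, r2 in scores.items():
    print(f"{name}: mean cross-validated R^2 = {r2:.3f}")
```

The ranking produced by this sketch reflects only the synthetic relationship assumed above and says nothing about the results reported for Thailand.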