with Stephan Dietrich and Stafford Nichols

Abstract: Earth observation data has become an increasingly valuable resource for social scientists. However, when it is combined with survey data, statistical inference is highly sensitive to a variety of methodological decisions that often go unexamined. Modifications to factors like spatial scale or temporal coverage can substantially alter conclusions, yet these details are rarely discussed or even made explicit. To illustrate, we reduce freely available land cover grids to the interview locations reported by a panel survey to calculate household-level exposure to deforestation across ten African countries. We construct 108 slightly different deforestation metrics by varying the spatial and temporal domains around each interview location and time, and by varying the threshold imposed on the pixel-level forest cover difference that yields our deforestation indicator. Subsequently, we regress respondents’ subjective well-being scores on these metrics. Although our models differ only in how the spatial data were processed, they seemingly support a wide range of conclusions. In a meta-analysis, we disentangle the informative model variation from noise to show the relative influence of each of the three parameters we varied and how consistent their influence is across models. In closing, we draw some lessons for using spatial data when modeling complex social outcomes.
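
To make the idea of a family of metrics concrete, the following is a minimal sketch of how one exposure measure can multiply into many once the spatial window, the temporal lag, and the pixel-level threshold are all allowed to vary. It is not the paper's implementation: the function name `deforestation_share`, the square pixel window, the toy random grids, and every parameter value are assumptions chosen for illustration, and the grid below yields 18 variants rather than the 108 used in the study.

```python
import itertools
import numpy as np

def deforestation_share(cover_before, cover_after, row, col, radius, threshold):
    """Share of pixels in a square window around (row, col) whose forest
    cover fell by at least `threshold` between the two grids."""
    r0, r1 = max(row - radius, 0), min(row + radius + 1, cover_before.shape[0])
    c0, c1 = max(col - radius, 0), min(col + radius + 1, cover_before.shape[1])
    loss = cover_before[r0:r1, c0:c1] - cover_after[r0:r1, c0:c1]
    return float(np.mean(loss >= threshold))

# Toy forest-cover fractions per pixel and year (hypothetical data).
rng = np.random.default_rng(0)
years = [2015, 2016, 2017, 2018]
cover = {years[0]: rng.uniform(0.5, 1.0, size=(50, 50))}
for prev, year in zip(years, years[1:]):
    yearly_loss = rng.uniform(0.0, 0.2, size=(50, 50))
    cover[year] = np.clip(cover[prev] - yearly_loss, 0.0, 1.0)

# Hypothetical parameter values; the study's actual values are not given here.
interview_year = 2018
interview_row, interview_col = 25, 25
radii = [2, 5, 10]        # half-width of the window, in pixels
lags = [1, 2, 3]          # years before the interview used as the baseline
thresholds = [0.1, 0.25]  # minimum drop in cover that counts as deforestation

# One metric per combination of the three processing choices.
metrics = {
    (radius, lag, threshold): deforestation_share(
        cover[interview_year - lag], cover[interview_year],
        interview_row, interview_col, radius, threshold,
    )
    for radius, lag, threshold in itertools.product(radii, lags, thresholds)
}
```

Each key in `metrics` corresponds to one processing choice for the same interview location, so the values show how far the exposure measure can move before any regression is run.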