Adjusting for Confounding with Text Matching

Publication Year
2020

Type
Journal Article

Abstract

We identify situations in which conditioning on text can address confounding in observational studies. We argue that a matching approach is particularly well-suited to this task, but existing matching methods are ill-equipped to handle high-dimensional text data. Our proposed solution is to estimate a low-dimensional summary of the text and condition on this summary via matching. We propose a method of text matching, topical inverse regression matching, that allows the analyst to match both on the topical content of confounding documents and the probability that each of these documents is treated. We validate our approach and illustrate the importance of conditioning on text to address confounding with two applications: the effect of perceptions of author gender on citation counts in the international relations literature and the effects of censorship on Chinese social media users.

Journal
American Journal of Political Science
Volume
64
Issue
4
Pages
887-903

NB: This paper is a revised version of the manuscript formerly titled "Matching Methods for High-Dimensional Data with Applications to Text".
Blog Post, Dataverse, Software