Causal Interpretation of Regressions With Ranks
Abstract
In studies of educational production functions or intergenerational mobility, it is common to transform the key variables into percentile ranks. Yet, it remains unclear what the regression coefficient estimates with ranks of the outcome or the treatment. In this paper, we derive effective causal estimands for a broad class of commonly-used regression methods, including the ordinary least squares (OLS), two-stage least squares (2SLS), difference-in-differences (DiD), and regression discontinuity designs (RDD). Specifically, we introduce a novel primitive causal estimand, the Rank Average Treatment Effect (rank-ATE), and prove that it serves as the building block of the effective estimands of all the aforementioned econometrics methods. For 2SLS, DiD, and RDD, we show that direct applications to outcome ranks identify parameters that are difficult to interpret. To address this issue, we develop alternative methods to identify more interpretable causal parameters.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2024
- DOI:
- arXiv:
- arXiv:2406.05548
- Bibcode:
- 2024arXiv240605548L
- Keywords:
-
- Economics - Econometrics;
- Mathematics - Statistics Theory;
- Statistics - Methodology