Estimating grouped data models with a binary dependent variable and fixed effect via logit vs OLS: the impact of dropped units
Abstract
This letter addresses a simple question: with grouped data, a binary dependent variable, and fixed effects (group-specific intercepts) in the specification, is Ordinary Least Squares (OLS) in any way superior to logit because OLS appears to keep all observations, whereas logit drops every group in which the dependent variable is all zeros or all ones? It is shown that OLS averages the estimates for the all-zero (and all-one) groups, whose slope coefficients are all zero by definition, with the slope coefficients for the groups that contain a mix of zeros and ones. The correct comparison of OLS to logit therefore uses only the groups with some variation in the dependent variable. Researchers using OLS are urged to report results both for all groups and for the subset of groups in which the dependent variable varies. The interpretation of the difference between these two sets of results depends on assumptions that cannot be empirically assessed.
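The averaging described in the abstract can be illustrated with a small simulation. The sketch below is not the letter's code or data: the data-generating process, the helper names `demean_by_group` and `within_slope`, and all parameter values are assumptions chosen for illustration. It uses the standard within (demeaned) form of fixed-effects OLS with a single regressor. In a group where the dependent variable is constant, the demeaned outcome is zero, so that group adds nothing to the numerator of the slope but still adds its within-group variation in x to the denominator.

```python
import numpy as np


def demean_by_group(v, groups):
    """Subtract each group's mean from its observations."""
    _, idx = np.unique(groups, return_inverse=True)
    means = np.bincount(idx, weights=v) / np.bincount(idx)
    return v - means[idx]


def within_slope(x, y, groups):
    """Fixed-effects OLS slope for one regressor (within estimator)."""
    x_dm = demean_by_group(x, groups)
    y_dm = demean_by_group(y, groups)
    return float(x_dm @ y_dm / (x_dm @ x_dm))


# Hypothetical simulated data: group intercepts are spread widely so that
# a number of groups end up with all zeros or all ones.
rng = np.random.default_rng(0)
n_groups, n_per_group = 200, 10
groups = np.repeat(np.arange(n_groups), n_per_group)
alpha = rng.normal(0.0, 3.0, size=n_groups)
x = rng.normal(size=groups.size)
p = 1.0 / (1.0 + np.exp(-(alpha[groups] + x)))
y = (rng.random(groups.size) < p).astype(float)

# Flag the groups in which the dependent variable actually varies.
group_mean_y = np.bincount(groups, weights=y) / n_per_group
varies = (group_mean_y > 0) & (group_mean_y < 1)
keep = varies[groups]

slope_all = within_slope(x, y, groups)
slope_varying = within_slope(x[keep], y[keep], groups[keep])

# Share of the within-group variation in x that comes from varying groups.
x_dm = demean_by_group(x, groups)
x_share = float((x_dm[keep] @ x_dm[keep]) / (x_dm @ x_dm))

print(f"groups with variation in y: {int(varies.sum())} of {n_groups}")
print(f"OLS slope, all groups:      {slope_all:.4f}")
print(f"OLS slope, varying groups:  {slope_varying:.4f}")
print(f"varying-group slope scaled by x share: {slope_varying * x_share:.4f}")
```

Under these assumptions the all-groups slope equals the varying-groups slope multiplied by `x_share`, which is the weighted average with zero that the abstract describes: the all-zero and all-one groups enter with a slope of exactly zero and a weight equal to their share of the within-group variation in x.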
- Publication: arXiv e-prints
- Pub Date: October 2018
- arXiv: arXiv:1810.12105
- Bibcode: 2018arXiv181012105B
- Keywords: Statistics - Applications
- E-Print: arXiv admin note: substantial text overlap with arXiv:1809.06505