Evaluating Four Methods for Detecting Differential Item Functioning in Large-Scale Assessments with More Than Two Groups
Abstract
This study used Monte Carlo simulation of controlled testing conditions to evaluate four multi-group differential item functioning (DIF) methods: the root mean square deviation (RMSD) approach, Wald-1, the generalized logistic regression procedure, and the generalized Mantel-Haenszel method. The conditions varied in the number of groups, the ability and sample size of the DIF-contaminated group, the parameter associated with DIF, and the proportion of DIF items. Comparing the Type-I error rates and power of the methods, we found that the RMSD approach yielded the best Type-I error rates when used with model-predicted cutoff values, but was overly conservative when used with the commonly used cutoff value of 0.1. Implications for future research for educational researchers and practitioners were discussed.
- Publication: arXiv e-prints
- Pub Date: August 2024
- DOI: 10.48550/arXiv.2408.11922
- arXiv: arXiv:2408.11922
- Bibcode: 2024arXiv240811922K
- Keywords: Statistics - Applications
- E-Print: preprint, 16 pages (excluding figures, references, and title page)