Simpson's paradox in Covid-19 case fatality rates: a mediation analysis of age-related causal effects
We point out an instance of Simpson's paradox in Covid-19 case fatality rates (CFRs): comparing a large-scale study from China (17 Feb) with early reports from Italy (9 Mar), we find that CFRs are lower in Italy for every age group, but higher overall. This phenomenon is explained by a stark difference in case demographics between the two countries. Using this as a motivating example, we introduce basic concepts from mediation analysis and show how they can be used to quantify different direct and indirect effects when assuming a coarse-grained causal graph involving country, age, and case fatality. We curate an age-stratified CFR dataset with >750k cases and conduct a case study, investigating total, direct, and indirect (age-mediated) causal effects between different countries and at different points in time. This allows us to separate age-related effects from those unrelated to age and facilitates a more transparent comparison of CFRs across countries at different stages of the Covid-19 pandemic. Using longitudinal data from Italy, we discover a sign reversal of the direct causal effect in mid-March, which temporally aligns with the reported collapse of the healthcare system in parts of the country. Moreover, we find that direct and indirect effects across 132 pairs of countries are only weakly correlated, suggesting that a country's policy and case demographics may be largely unrelated. We point out limitations and extensions for future work and, finally, discuss the role of causal reasoning in the broader context of using AI to combat the Covid-19 pandemic.
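To make the decomposition concrete, the following is a minimal sketch of how total, direct, and indirect (age-mediated) effects could be computed for two countries under the coarse-grained graph country -> age -> fatality with an additional direct edge country -> fatality. It uses the standard mediation formulas and assumes no unobserved confounding. The two age groups, the country labels "base" and "other", and all numbers are hypothetical and chosen only to reproduce the paradox; they are not taken from the curated dataset.

```python
# Hypothetical example: two age groups (younger, older) in two countries.
# All shares and rates below are illustrative assumptions, not real data.
age_share = {"base": [0.8, 0.2], "other": [0.3, 0.7]}       # P(age | country)
cfr = {"base": [0.010, 0.100], "other": [0.005, 0.080]}     # P(death | country, age)


def overall_cfr(shares, rates):
    """Aggregate CFR: age-specific rates weighted by the case age distribution."""
    return sum(p * r for p, r in zip(shares, rates))


def total_effect(base, other):
    """TE: change in overall CFR when switching country from base to other."""
    return (overall_cfr(age_share[other], cfr[other])
            - overall_cfr(age_share[base], cfr[base]))


def natural_direct_effect(base, other):
    """NDE: switch the age-specific CFRs but keep the base age distribution."""
    return sum(p * (r_other - r_base)
               for p, r_base, r_other in zip(age_share[base], cfr[base], cfr[other]))


def natural_indirect_effect(base, other):
    """NIE: switch the age distribution but keep the base age-specific CFRs."""
    return sum((p_other - p_base) * r
               for p_base, p_other, r in zip(age_share[base], age_share[other], cfr[base]))


te = total_effect("base", "other")
nde = natural_direct_effect("base", "other")
nie = natural_indirect_effect("base", "other")

# Simpson's paradox: "other" has a lower CFR in every age group ...
lower_in_every_group = all(o < b for b, o in zip(cfr["base"], cfr["other"]))
# ... yet a higher overall CFR, because its cases skew older.
higher_overall = te > 0

print(f"TE  = {te:+.4f}")
print(f"NDE = {nde:+.4f}")
print(f"NIE = {nie:+.4f}")
print("Paradox present:", lower_in_every_group and higher_overall)
```

In this toy setting the direct effect is negative while the indirect effect is positive and larger in magnitude, so the total effect is positive. Note that the total effect is not simply NDE + NIE in the same direction; it satisfies TE = NDE(base -> other) - NIE(other -> base), which the sketch does not print but which can be checked with the same functions.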