Understanding Civil War Violence through Military Intelligence: Mining Civilian Targeting Records from the Vietnam War
Military intelligence is underutilized in the study of civil war violence. Declassified records are hard to acquire and difficult to explore with the standard econometrics toolbox. I investigate a contemporary government database of civilians targeted during the Vietnam War. The data are detailed, with up to 45 attributes recorded for each of 73,712 individual civilian suspects. I employ an unsupervised machine learning approach consisting of cleaning, variable selection, dimensionality reduction, and clustering. I find support for a simplifying typology of civilian targeting that distinguishes different kinds of suspects and different kinds of targeting methods. The typology is robust, successfully clustering both government actors and rebel departments into groups that mirror their known functions. The exercise highlights methods for dealing with high-dimensional found conflict data. It also illustrates how aggregating measures of political violence masks both a complex underlying data-generating process and a complex institutional reporting process.