Clostridium difficile has rapidly emerged as the leading cause of antibiotic-associated diarrheal disease, with the transcontinental spread of various PCR ribotypes, including 001, 017, 027 and 078. However, the genetic basis for the emergence of C. difficile as a human pathogen is unclear. Whole genome sequencing was used to analyze genetic variation and virulence of a diverse collection of thirty C. difficile isolates, to determine both macro and microevolution of the species. Horizontal gene transfer and large-scale recombination of core genes has shaped the C. difficile genome over both short and long time scales. Phylogenetic analysis demonstrates C. difficile is a genetically diverse species, which has evolved within the last 1.1-85 million years. By contrast, the disease-causing isolates have arisen from multiple lineages, suggesting that virulence evolved independently in the highly epidemic lineages.