Identifying change patterns in software history
Abstract
Traditional algorithms for detecting differences in source code compare text line by line. As such, they reveal little about the abstract changes that occur over time within a project. Structural differencing on the program's abstract syntax tree reveals changes at the syntactic level, which allows us to process the differences further to understand their meaning. We propose that grouping changes by a similarity metric, followed by pattern extraction via antiunification, will allow us to identify patterns of change within a software project from the sequence of changes recorded in a Version Control System (VCS). Tree similarity metrics such as tree edit distance can be used to group changes so that each group may represent a single class of change (e.g., adding a parameter to a function call). By applying antiunification within each group, we are able to generalize from families of concrete changes to patterns of structural change. Studying patterns of change at the structural level, instead of line by line, allows us to gain insight into the evolution of software.
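To make the antiunification step concrete, the following is a minimal sketch of first-order antiunification over small trees. It is an illustration under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the "replace" change record, and the names Hole and antiunify are all hypothetical, and the grouping step by tree edit distance is not shown. Two concrete changes that each add an argument to a call are generalized into one pattern, with pattern variables marking the places where the changes differ.

```python
class Hole:
    """Pattern variable marking a position where two trees disagree."""

    def __init__(self, index):
        self.index = index

    def __repr__(self):
        return f"?{self.index}"


def antiunify(a, b, holes=None):
    """Return a generalization of trees a and b.

    Trees are nested tuples of the form (label, child, ...), with plain
    strings as leaves (an assumed encoding). The same pair of differing
    subtrees is always mapped to the same Hole, so repeated differences
    share one pattern variable.
    """
    if holes is None:
        holes = {}
    if a == b:
        return a
    same_shape = (
        isinstance(a, tuple)
        and isinstance(b, tuple)
        and len(a) == len(b)
        and a[0] == b[0]
    )
    if same_shape:
        children = tuple(antiunify(x, y, holes) for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    # The trees differ here: introduce a variable, or reuse an existing one.
    if (a, b) not in holes:
        holes[(a, b)] = Hole(len(holes))
    return holes[(a, b)]


# Two hypothetical changes, each recorded as ("replace", before, after).
change1 = ("replace",
           ("call", "f", ("args", "x")),
           ("call", "f", ("args", "x", "0")))
change2 = ("replace",
           ("call", "g", ("args", "y")),
           ("call", "g", ("args", "y", "0")))

pattern = antiunify(change1, change2)
print(pattern)
```

In the resulting pattern the function name and the first argument become variables that appear on both sides of the change, while the added argument "0" stays concrete. That is the kind of structural pattern ("add a parameter to a function call") the abstract describes.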
- Publication: arXiv e-prints
- Pub Date: July 2013
- arXiv: arXiv:1307.1719
- Bibcode: 2013arXiv1307.1719D
- Keywords: Computer Science - Software Engineering
- E-Print: 7 pages, submitted to document changes 2013