The Unexplored Treasure Trove of Phabricator Code Review
Abstract
Phabricator is a modern code collaboration tool used by popular projects like FreeBSD and Mozilla. However, unlike the other well-known code review environments, such as Gerrit or GitHub, there is no readily accessible public code review dataset for Phabricator. This paper describes our experience mining code reviews from five different projects that use Phabricator (Blender, FreeBSD, KDE, LLVM, and Mozilla). We discuss the challenges associated with the data retrieval process and our solutions, resulting in a dataset with details regarding 317,476 Phabricator code reviews. Our dataset is available in both JSON and MySQL database dump formats. The dataset enables analyses of the history of code reviews at a more granular level than other platforms. In addition, given that the projects we mined are publicly accessible via the Conduit API, our dataset can be used as a foundation to fetch additional details and insights.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2022
- DOI:
- arXiv:
- arXiv:2203.07473
- Bibcode:
- 2022arXiv220307473K
- Keywords:
-
- Computer Science - Software Engineering
- E-Print:
- 5 pages. To be published in Proceedings of MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories (MSR 2022). ACM, New York, NY, USA