QuerTCI: A Tool Integrating GitHub Issue Querying with Comment Classification
Abstract
Empirical Software Engineering (ESE) researchers study (open-source) project issues and the comments and threads within to discover -- among others -- challenges developers face when incorporating new technologies, platforms, and programming language constructs. However, such threads accumulate, becoming unwieldy and hindering any insight researchers may gain. While existing approaches alleviate this burden by classifying issue thread comments, there is a gap between searching popular open-source software repositories (e.g., those on GitHub) for issues containing particular keywords and feeding the results into a classification model. This paper demonstrates a research infrastructure tool called QuerTCI that bridges this gap by integrating the GitHub issue comment search API with the classification models found in existing approaches. Using queries, ESE researchers can retrieve GitHub issues containing particular keywords, e.g., those related to a specific programming language construct, and, subsequently, classify the discussions occurring in those issues. We hope ESE researchers can use our tool to uncover challenges related to particular technologies using specific keywords through popular open-source repositories more seamlessly than previously possible. A tool demonstration video may be found at: https://youtu.be/fADKSxn0QUk.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2022
- DOI:
- arXiv:
- arXiv:2202.08761
- Bibcode:
- 2022arXiv220208761P
- Keywords:
-
- Computer Science - Software Engineering