Attribution Required: Stack Overflow Code Snippets in GitHub Projects
Abstract
Stack Overflow (SO) is the largest Q&A website for developers, providing a large number of copyable code snippets. Using these snippets raises various maintenance and legal issues. The SO license requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated debate on SO's license model for code snippets and the required attribution, little is known about the extent to which snippets are copied from SO without proper attribution. In this paper, we present the research design and summarized results of an empirical study analyzing attributed and unattributed usages of SO code snippets in GitHub projects. On average, 3.22% of all analyzed repositories and 7.33% of the popular ones contained a reference to SO. Further, we found that developers tend to refer to the whole thread on SO rather than to a specific answer. For Java, at least two thirds of the copied snippets were not attributed.
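To make the distinction between a reference to a whole thread and a reference to a specific answer concrete, the following is a minimal sketch of how SO links found in source files could be classified. It is not the study's actual tooling: the function name `classify_so_references`, the regular expression, and the URL shapes it recognizes are assumptions made for illustration only.

```python
import re

# Assumed URL shapes (illustrative, not taken from the paper):
#   /questions/<qid>[/<slug>]             -> whole thread
#   /q/<qid>                              -> whole thread (short link)
#   /a/<aid>                              -> specific answer (short link)
#   /questions/<qid>/<slug>/<aid>, #<aid> -> specific answer
SO_LINK = re.compile(
    r"https?://(?:www\.)?stackoverflow\.com/"
    r"(?:(?P<short>[qa])/(?P<short_id>\d+)"
    r"|questions/(?P<qid>\d+)"
    r"(?:/[^\s/#?]*)?"            # optional title slug
    r"(?:/(?P<aid_path>\d+))?"    # optional answer id as path segment
    r"(?:#(?P<aid_frag>\d+))?)"   # optional answer id as fragment
)


def classify_so_references(text):
    """Return (kind, post_id) tuples, where kind is "thread" or "answer"."""
    refs = []
    for m in SO_LINK.finditer(text):
        if m.group("short"):
            kind = "answer" if m.group("short") == "a" else "thread"
            refs.append((kind, int(m.group("short_id"))))
            continue
        answer_id = m.group("aid_path") or m.group("aid_frag")
        if answer_id:
            refs.append(("answer", int(answer_id)))
        else:
            refs.append(("thread", int(m.group("qid"))))
    return refs


# Hypothetical source comments containing attributions.
sample = """
// Adapted from https://stackoverflow.com/a/11227902
// See https://stackoverflow.com/questions/11227809/why-is-it-faster
"""
print(classify_so_references(sample))
# [('answer', 11227902), ('thread', 11227809)]
```

Counting the two kinds over all files of a repository would give a rough picture of how precisely its attributions point at the copied snippet; a link to the thread alone leaves open which answer the code came from.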
- Publication: arXiv e-prints
- Pub Date: July 2017
- DOI: 10.48550/arXiv.1707.00452
- arXiv: arXiv:1707.00452
- Bibcode: 2017arXiv170700452B
- Keywords: Computer Science - Software Engineering
- E-Print: 3 pages, 1 figure, Proceedings of the 39th International Conference on Software Engineering Companion (ICSE-C 2017), IEEE, 2017, pp. 161-163