Automatic Tools for Enhancing the Collaborative Experience in Large Projects
Abstract
With the explosion of big data in many fields, the efficient management of knowledge about all aspects of the data analysis gains in importance. A key feature of collaboration in large scale projects is keeping a log of what is being done and how - for private use, reuse, and for sharing selected parts with collaborators and peers, often distributed geographically on an increasingly global scale. Even better if the log is automatically created on the fly while the scientist or software developer is working in a habitual way, without the need for extra efforts. This saves time and enables a team to do more with the same resources. The CODESH - COllaborative DEvelopment SHell - and CAVES - Collaborative Analysis Versioning Environment System projects address this problem in a novel way. They build on the concepts of virtual states and transitions to enhance the collaborative experience by providing automatic persistent virtual logbooks. CAVES is designed for sessions of distributed data analysis using the popular ROOT framework, while CODESH generalizes the approach for any type of work on the command line in typical UNIX shells like bash or tcsh. Repositories of sessions can be configured dynamically to record and make available the knowledge accumulated in the course of a scientific or software endeavor. Access can be controlled to define logbooks of private sessions or sessions shared within or between collaborating groups. A typical use case is building working scalable systems for analysis of Petascale volumes of data as encountered in the LHC experiments. Our approach is general enough to find applications in many fields.
- Publication:
-
Journal of Physics Conference Series
- Pub Date:
- June 2014
- DOI:
- 10.1088/1742-6596/513/6/062007
- Bibcode:
- 2014JPhCS.513f2007B