The TraceLab project seeks to develop an experimental workbench for designing, constructing, and executing traceability experiments, and for facilitating the rigorous evaluation of different traceability techniques. TraceLab is similar in some respects to existing tools such as Weka, MATLAB, or RapidMiner, except that it is highly customized to support rigorous Software Engineering experiments as opposed to general data mining ones.
Funding: TraceLab is funded under National Science Foundation MRI-R2 Grant # CNS: 0959924 entitled "Development of a Software Traceability Instrument to Facilitate and Empower Traceability Research and Technology Transfer."
Collaborators: Jane Cleland-Huang (PI), Ed Keenan, Jane Huffman Hayes, Olly Gotel, Jonathan Maletic, Denys Poshyvanyk, Adam Czauderna, Greg Leach, Yonghee Shin, Brian Berenbach, David Dolores. 2010-2013, US $2,000,000.
TraceLab includes the following features:
1. A visual environment for designing and executing experiments.
2. A component library which facilitates sharing of a wide variety of importers, pre-processors, tracers, algorithms, analyzers, etc. across the traceability community.
3. Support for writing components in a variety of languages, including C++, C#, and Java, and for combining them into a single experimental workflow.
4. A flexible workflow engine that supports a wide variety of typical traceability experiments.
5. Portability across multiple operating systems, including Windows, Linux, and Mac OS (the Mac and Linux ports are in progress).
6. An intuitive user interface which enables new users to execute basic experiments without any formal training.
To use TraceLab, a researcher can either retrieve an existing experiment or create a new one from scratch. Each experiment is represented as a precedence graph, which determines the order in which components are executed. Components in the graph exchange data through a "workspace". At the start of an experiment, data is loaded into the workspace by special importer components, typically from external sources (databases, XML files, etc.). Although data may be stored in any user-defined data structure, TraceLab provides a fairly extensive set of predefined data types which, if used, increase plug-and-play compatibility across components. The most common ones are Similarity Matrix and Artifacts Collection. A given component in the precedence graph can use any of the data currently in the workspace, and any new data elements output by that component then become available to downstream components.
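The interplay between the precedence graph and the workspace can be illustrated with a minimal Python sketch. This is not TraceLab's actual API: the component names, the dictionary-based workspace, and the `run_experiment` function are all hypothetical, and the "tracer" is a naive term-overlap counter standing in for a real tracing algorithm. The sketch only shows the idea that components run in an order consistent with the graph and communicate solely through shared workspace entries.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def import_requirements(ws):
    # Hypothetical importer: loads source artifacts into the workspace.
    ws["source_artifacts"] = {"R1": "the user shall log in"}


def import_code(ws):
    # Hypothetical importer: loads target artifacts into the workspace.
    ws["target_artifacts"] = {"M1": "void login user", "M2": "void print report"}


def trace(ws):
    # Stand-in tracer: counts the terms shared by each source/target pair
    # and stores the result under a "similarity_matrix" workspace key.
    matrix = {}
    for sid, stext in ws["source_artifacts"].items():
        for tid, ttext in ws["target_artifacts"].items():
            shared = set(stext.split()) & set(ttext.split())
            matrix[(sid, tid)] = len(shared)
    ws["similarity_matrix"] = matrix


# Each component is a callable that reads from and writes to the workspace.
COMPONENTS = {
    "requirements_importer": import_requirements,
    "code_importer": import_code,
    "tracer": trace,
}

# Precedence graph: each node maps to the set of nodes that must run first.
PREDECESSORS = {"tracer": {"requirements_importer", "code_importer"}}


def run_experiment(components, predecessors):
    """Run every component in an order consistent with the precedence graph."""
    workspace = {}
    order = list(TopologicalSorter(predecessors).static_order())
    for name in order:
        components[name](workspace)
    return order, workspace


order, workspace = run_experiment(COMPONENTS, PREDECESSORS)
```

After the run, `workspace["similarity_matrix"]` holds the tracer's output, and any component added downstream of `tracer` in the graph could read it. Agreeing on shared keys and shapes in this way plays the role that TraceLab's predefined data types play for component compatibility.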
The following screenshot depicts a basic experiment for tracing between requirements and Java code using the Vector Space Model (VSM). This experiment uses four importers to import Java code methods (the target), stop words, requirements (the source), and the answer set against which results are evaluated.
The imported artifacts are then preprocessed to remove unwanted characters and stop words, to stem terms to their root forms, and to produce a dictionary of terms.
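The preprocessing and VSM steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not TraceLab's implementation: the stop-word list is a toy one, `crude_stem` is a deliberately simple suffix stripper rather than a real stemmer such as Porter's, the dictionary is reduced to document frequencies, and the smoothed TF-IDF weighting is only one of several common VSM variants. All function names and sample artifacts are hypothetical.

```python
import math
import re
from collections import Counter

# Toy stop-word list (assumption); a real experiment would import one.
STOP_WORDS = {"the", "a", "an", "shall", "to", "of", "in", "be"}


def crude_stem(term):
    # Very rough suffix stripping, used only for illustration.
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def preprocess(text):
    # Replace non-letter characters, drop stop words, then stem.
    tokens = re.sub(r"[^a-z]+", " ", text.lower()).split()
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def build_dictionary(docs):
    # Maps each term to the number of documents that contain it.
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))
    return df


def tfidf_vector(terms, df, n_docs):
    # Term frequency times a smoothed inverse document frequency.
    tf = Counter(terms)
    return {t: tf[t] * math.log(1 + n_docs / df[t]) for t in tf if df.get(t)}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical sample artifacts.
sources = {"R1": "The system shall validate users."}
targets = {"M1": "validates the user password", "M2": "prints monthly reports"}

docs = {k: preprocess(v) for k, v in {**sources, **targets}.items()}
df = build_dictionary(docs)
vectors = {k: tfidf_vector(terms, df, len(docs)) for k, terms in docs.items()}

# Similarity matrix: one score per (source, target) pair.
similarity = {
    (s, t): cosine(vectors[s], vectors[t]) for s in sources for t in targets
}
```

In this toy data set, `R1` shares the stemmed terms `validate` and `user` with `M1` and no terms with `M2`, so `M1` ranks above `M2` as a candidate link. In a full experiment, the resulting scores would then be compared against the imported answer set.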
Personas: As part of the TraceLab development process, we identified likely users and summarized their primary needs through the use of personas. Four of these personas and their anticipated usage scenarios are depicted below. They represent (i) researchers, who will use TraceLab to design and execute experiments, evaluate results against benchmarked baselines, exchange components, and train students; (ii) PhD students, who will use TraceLab to quickly establish their experimental environments and get acclimated to traceability research; (iii) developers, who will develop the initial releases of TraceLab or help to maintain it over the long term; and (iv) industry adopters, who may wish to pilot various traceability components on their own projects.