A framework for reproducible computational research

by Pham, Quan Tran, Ph.D., THE UNIVERSITY OF CHICAGO, 2014, 122 pages; 3638672


In today's world of publishing, reproducing research results has become challenging as scientific research has become inherently computational. Encoding a computation-based result in a text-based paper is nearly impractical, which leads to the overarching research question: “Can computation-based research papers be reproducible?”

The aim of this thesis is to describe frameworks and tools that, if provided to authors, can aid in assessing computational reproducibility. Towards this aim, the thesis proposes a reproducibility framework, Science Object Linking and Embedding (SOLE), for creating descriptive and interactive publications by linking them with associated science objects, such as source code, datasets, annotations, workflows, process and data provenance, and re-executable software packages. To create science objects in a linkable representation for use within research papers, the thesis describes a set of tools as part of the framework. In particular, it focuses on Provenance-To-Use (PTU), an application virtualization tool that encapsulates source code, data, and all associated data and software dependencies into a package. We describe how, by capturing data dependencies, PTU allows full and partial repeatability of the virtualized software, and how, by capturing software dependencies, PTU can be used for building and maintaining software pipelines. Finally, we show how PTU can be used to provide computational reproducibility in a distributed environment.

We evaluate and validate the framework by applying it to several representative publications and determining the extent to which computational reproducibility is achievable.

Advisers: Ian Foster; Tanu Malik
Source Type: Dissertation
Subjects: Computer science
Publication Number: 3638672
