Supporting Scientific Collaboration through Workflows and Provenance
2010 (English). Licentiate thesis, monograph (Other academic)
Science is changing. Computers, fast communication, and new technologies have created new ways of conducting research. For instance, researchers from different disciplines are processing and analyzing scientific data that is increasing at an exponential rate. This kind of research requires that the scientists have access to tools that can handle huge amounts of data, enable access to vast computational resources, and support the collaboration of large teams of scientists. This thesis focuses on tools that help support scientific collaboration.
Workflows and provenance are two concepts that have proven useful in supporting scientific collaboration. Workflows provide a formal specification of scientific experiments, and provenance offers a model for documenting data and process dependencies. Together, they enable the creation of tools that can support collaboration through the whole scientific life-cycle, from specification of experiments to validation of results. However, existing models for workflows and provenance are often specific to particular tasks and tools. This makes it hard to analyze the history of data that has been generated over several application areas by different tools. Moreover, workflow design is a time-consuming process and often requires extensive knowledge of the tools involved and collaboration with researchers with different expertise. This thesis addresses these problems.
Our first contribution is a study of the differences between two approaches to interoperability between provenance models: direct data conversion and mediation. We perform a case study in which we integrate three different provenance models using the mediation approach, and show its advantages over data conversion. Our second contribution supports workflow design by allowing multiple users to design workflows concurrently. Current workflow tools lack the ability for users to work simultaneously on the same workflow. We propose a method that uses the provenance of workflow evolution to enable real-time collaborative design of workflows. Our third contribution supports workflow design through the reuse of existing workflows. Workflow collections for reuse are available, but more efficient methods for generating summaries of search results are still needed. We explore new summarization strategies that consider the workflow structure.
Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2010. 70 p.
Linköping Studies in Science and Technology. Thesis, ISSN 0280-7971 ; 1427
Scientific collaboration, workflow, provenance, search engine, query language, data integration
National Category: Computer Science
Identifiers: URN: urn:nbn:se:liu:diva-52017; Local ID: LiU–Tek–Lic–2009:35; ISBN: 978-91-7393-461-9; OAI: oai:DiVA.org:liu-52017; DiVA: diva2:278820
2010-01-15, Visionen, Hus B, Campus Valla, Linköpings universitet, Linköping, 15:00 (English)
Missier, Paolo, Ph.D.
Shahmehri, Nahid, Professor