Title: File-based localization of numerical perturbations in data analysis pipelines

Abstract: Data analysis pipelines are known to be impacted by computational conditions, presumably due to the creation and propagation of numerical errors. While this process could play a major role in the current reproducibility crisis, the precise causes of such instabilities and the path along which they propagate in pipelines are unclear. We present Spot, a tool to identify which processes in a pipeline create numerical differences when executed in different computational conditions. Spot leverages system-call interception through ReproZip to reconstruct and compare provenance graphs without pipeline instrumentation. By applying Spot to the structural pre-processing pipelines of the Human Connectome Project, we found that linear and non-linear registration are the cause of most numerical instabilities in these pipelines, which confirms previous findings.
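
To make the file-based idea concrete, here is a minimal sketch in Python. It is not Spot's implementation and does not use ReproZip. It assumes the provenance has already been reduced to a hypothetical dictionary mapping each process name to the relative paths of its input and output files, and that the same pipeline was run under two conditions into two directories (`root_a`, `root_b`). All function names and the dictionary layout are assumptions made for illustration. A process is flagged when its inputs are byte-identical across conditions but at least one of its outputs is not, which separates processes that create differences from those that only propagate them.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def differs(root_a, root_b, rel_path):
    """True if the file at rel_path differs between the two condition directories."""
    return file_digest(Path(root_a) / rel_path) != file_digest(Path(root_b) / rel_path)


def processes_creating_differences(provenance, root_a, root_b):
    """Flag processes whose inputs match across conditions but whose outputs differ.

    `provenance` uses a hypothetical layout chosen for this sketch:
    {process_name: {"inputs": [rel_path, ...], "outputs": [rel_path, ...]}}
    """
    flagged = []
    for name, files in provenance.items():
        inputs_match = not any(differs(root_a, root_b, p) for p in files["inputs"])
        outputs_differ = any(differs(root_a, root_b, p) for p in files["outputs"])
        if inputs_match and outputs_differ:
            flagged.append(name)
    return flagged
```

A byte-level checksum is the strictest possible comparison: it would also flag files that differ only in embedded metadata such as timestamps, so a real analysis would likely need a format-aware comparison for some file types.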
Comments: 10 pages, 6 figures, 2 tables
Subjects: Quantitative Methods (q-bio.QM); Image and Video Processing (eess.IV)
Cite as: arXiv:2006.04684 [q-bio.QM]
  (or arXiv:2006.04684v2 [q-bio.QM] for this version)

Submission history

From: Ali Salari
[v1] Wed, 3 Jun 2020 19:11:40 GMT (770kb,D)
[v2] Tue, 29 Sep 2020 01:00:09 GMT (862kb)