{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T12:31:30Z","timestamp":1742387490435,"version":"3.38.0"},"reference-count":36,"publisher":"China Science Publishing & Media Ltd.","issue":"2","license":[{"start":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T00:00:00Z","timestamp":1646611200000},"content-version":"vor","delay-in-days":65,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>We present a set of configurable Web services and interactive tools, S-ProvFlow, for managing and exploiting records tracking data lineage during workflow runs. It facilitates detailed analysis of single executions. It helps users manage complex tasks by exposing the relationships between data, people, equipment and workflow runs intended to combine productively. Its logical model extends the PROV standard to precisely record parallel data-streaming applications. Its metadata handling encourages users to capture the application context by specifying how application attributes, often using standard vocabularies, should be added. These metadata records immediately help productivity as the interactive tools support their use in selection and bulk operations. Users rapidly appreciate the power of the encoded semantics as they reap the benefits. This improves the quality of provenance for users and management, which in turn facilitates analysis of collections of runs, enabling users to manage results and validate procedures. It fosters reuse of data and methods and facilitates diagnostic investigations and optimisations. 
We present S-ProvFlow's use by scientists, research engineers and managers as part of the DARE hyper-platform as they create, validate and use their data-driven scientific workflows.<\/jats:p>","DOI":"10.1162\/dint_a_00128","type":"journal-article","created":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T18:06:41Z","timestamp":1646676401000},"page":"226-242","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":1,"title":["S-ProvFlow. Storing and Exploring Lineage Data as a\n                    Service"],"prefix":"10.3724","volume":"4","author":[{"given":"Alessandro","family":"Spinuso","sequence":"first","affiliation":[{"name":"Koninklijk Nederlands Meteorologisch Instituut, De Bilt, Utrecht 3731 GA, The Netherlands"}]},{"given":"Malcolm","family":"Atkinson","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, Edinburgh EH8 9AB, United Kingdom"}]},{"given":"Federica","family":"Magnoni","sequence":"additional","affiliation":[{"name":"Istituto Nazionale Geofisica e Vulcanologia, Rome, Lazio 00143, Italy"}]}],"member":"2026","published-online":{"date-parts":[[2022,4,1]]},"reference":[{"key":"2022042714424239300_ref1","first-page":"485","volume-title":"Towards sustainable curation and preservation: The sead\n                        project's data services approach","author":"Myers","year":"2015"},{"volume-title":"A vast machine: Computer models, climate data, and the politics of\n                        global warming","year":"2010","author":"Edwards","key":"2022042714424239300_ref2"},{"key":"2022042714424239300_ref3","first-page":"560","volume-title":"Active provenance for data-intensive workflows: Engaging users and\n                        developers","author":"Spinuso","year":"2019"},{"volume-title":"Active provenance for data intensive 
research","author":"Spinuso","key":"2022042714424239300_ref4"},{"key":"2022042714424239300_ref5","first-page":"454","volume-title":"dispel4py: An agile framework for data-intensive escience","author":"Filgueira","year":"2015"},{"key":"2022042714424239300_ref6","first-page":"9","volume-title":"dispel4py: A python framework for data-intensive scientific\n                        computing","author":"Filgueira","year":"2014"},{"volume-title":"Dare: A reflective platform designed to enable agile data-driven\n                        research on the cloud","year":"2019","author":"Klampanos","key":"2022042714424239300_ref7"},{"key":"2022042714424239300_ref8","doi-asserted-by":"crossref","DOI":"10.1109\/eScience.2019.00042","volume-title":"Comprehensible control for researchers and developers facing data\n                        challenges","author":"Atkinson","year":"2019"},{"key":"2022042714424239300_ref9","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1007\/s00799-007-0018-5","article-title":"Provenance explorer\u2014A graphical interface for\n                        constructing scientific publication packages from provenance\n                        trails","volume":"7","author":"Hunter","year":"2007","journal-title":"International Journal on Digital\n                        Libraries"},{"issue":"2","key":"2022042714424239300_ref10","doi-asserted-by":"crossref","first-page":"28","DOI":"10.2218\/ijdc.v9i2.332","article-title":"The PBase scientific workflow provenance\n                        repository","volume":"9","author":"Cuevas-Vicentt\u00edn","year":"2014","journal-title":"International Journal of Digital\n                        Curation"},{"issue":"12","key":"2022042714424239300_ref11","doi-asserted-by":"crossref","first-page":"2476","DOI":"10.1109\/TVCG.2013.155","article-title":"Evaluation of filesystem provenance visualization\n                        tools","volume":"19","author":"Borkin","year":"2013","journal-title":"IEEE Transactions on 
Visualization and\n                        Computer Graphics"},{"key":"2022042714424239300_ref12","first-page":"275","volume-title":"Provstore: A public provenance repository","author":"Huynh","year":"2014"},{"volume-title":"MongoDB Document-oriented Data Base (2020)","key":"2022042714424239300_ref13"},{"key":"2022042714424239300_ref14","first-page":"1","volume-title":"How much domain data should be in provenance databases? In:\n                        Proceedings of the 7th USENIX Conference on Theory and Practice of\n                        Provenance (TaPP' 15), pp","author":"De\n                                Oliveira","year":"2015"},{"key":"2022042714424239300_ref15","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/s10619-012-7104-4","article-title":"MTCProv: A practical provenance query framework for\n                        many-task scientific computing","volume":"30","author":"Gadelha","year":"2012","journal-title":"Distributed Parallel\n                        Databases"},{"first-page":"20","volume-title":"Provenance map orbiter: Interactive exploration of large provenance\n                        graphs","author":"Seltzer","key":"2022042714424239300_ref16"},{"key":"2022042714424239300_ref17","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1007\/978-3-319-16462-5_18","article-title":"Prov-O-Viz\u2014understanding the role of activities in\n                        provenance.","volume-title":"Provenance and Annotation of Data and Processes","author":"Hoekstra","year":"2015"},{"issue":"5","key":"2022042714424239300_ref18","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1109\/TVCG.2009.23","article-title":"A survey of radial methods for information\n                        visualization","volume":"15","author":"Draper","year":"2009","journal-title":"IEEE Transactions on Visualization\n                        and Computer Graphics"},{"key":"2022042714424239300_ref19","first-page":"308","volume-title":"A visual network 
analysis method for large-scale parallel I\/O\n                        systems","author":"Sigovan","year":"2013"},{"volume-title":"Provenance-aware storage systems","author":"PASS","key":"2022042714424239300_ref20"},{"key":"2022042714424239300_ref21","first-page":"43","volume-title":"Provenance-aware storage systems","author":"Muniswamy-Reddy","year":"2006"},{"issue":"5","key":"2022042714424239300_ref22","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1109\/TVCG.2006.147","article-title":"Hierarchical edge bundles: Visualization of adjacency\n                        relations in hierarchical data","volume":"12","author":"Holten","year":"2006","journal-title":"IEEE Transactions on\n                        Visualization and Computer Graphics"},{"volume-title":"ProvStore, provenance storage and distribution","key":"2022042714424239300_ref23"},{"key":"2022042714424239300_ref24","doi-asserted-by":"crossref","first-page":"854","DOI":"10.1016\/j.future.2017.12.029","article-title":"Computing environments for reproducibility: Capturing the\n                        \u201cwhole tale\u201d","volume":"94","author":"Brinckman","year":"2019","journal-title":"Future Generation\n                        Computer Systems"},{"volume-title":"D3.4\n                        data lineage services ii (2020)","author":"Spinuso","key":"2022042714424239300_ref25"},{"volume-title":"Dare architecture and technology D2.2 (Version 1) (2020)","author":"Malcolm","key":"2022042714424239300_ref26"},{"volume-title":"D6.3 pilot tools and services, execution and evaluation report I\n                        (2019)","author":"Magnoni","key":"2022042714424239300_ref27"},{"volume-title":"D6.4 pilot tools and services, execution and evaluation report II\n                        (2020)","author":"Magnoni","key":"2022042714424239300_ref28"},{"issue":"2","key":"2022042714424239300_ref29","doi-asserted-by":"crossref","first-page":"792","DOI":"10.1093\/gji\/ggw173","article-title":"Uncertainty 
estimations for moment tensor inversions: The\n                        issue of the 2012 May 20 Emilia earthquake","volume":"206","author":"Scognamiglio","year":"2016","journal-title":"Geophysical Journal International"},{"issue":"3","key":"2022042714424239300_ref30","doi-asserted-by":"crossref","first-page":"1739","DOI":"10.1093\/gji\/ggw356","article-title":"Global adjoint tomography: First-generation\n                        model","volume":"207","author":"Bozda\u011f","year":"2016","journal-title":"Geophysical Journal International"},{"issue":"24","key":"2022042714424239300_ref31","doi-asserted-by":"crossref","first-page":"L24304","DOI":"10.1029\/2011GL049750","article-title":"Variations of crustal elastic properties during the 2009\n                        L'Aquila earthquake inferred from cross-correlations of ambient\n                        seismic noise","volume":"38","author":"Zaccarelli","year":"2011","journal-title":"Geophysical Research Letters"},{"issue":"3","key":"2022042714424239300_ref32","doi-asserted-by":"crossref","first-page":"1726","DOI":"10.1785\/0220200409","article-title":"The Italian node of the European integrated data\n                        archive","volume":"92","author":"Danecek","year":"2021","journal-title":"Seismological Research Letters"},{"volume-title":"INSTANCE\u2014The Italian seismic dataset for machine\n                        learning","author":"Michelini","key":"2022042714424239300_ref33","doi-asserted-by":"crossref","DOI":"10.5194\/essd-13-5509-2021"},{"issue":"2","key":"2022042714424239300_ref34","article-title":"FAIR digital objects for science: From data pieces to\n                        actionable knowledge units","volume":"8","author":"De\n                                Smedt","year":"2020","journal-title":"Publication"},{"issue":"1","key":"2022042714424239300_ref35","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1177\/1094342017704893","article-title":"The future of scientific 
workflows","volume":"32","author":"Deelman","year":"2018","journal-title":"The International Journal of High Performance Computing\n                        Applications"},{"issue":"2","key":"2022042714424239300_ref36","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1162\/dint_a_00129","article-title":"SWIRRL: Managing provenance-aware and reproducible\n                        workspaces","volume":"4","author":"Spinuso","year":"2022","journal-title":"Data Intelligence"}],"container-title":["Data Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/dint\/article-pdf\/4\/2\/226\/2012448\/dint_a_00128.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/dint\/article-pdf\/4\/2\/226\/2012448\/dint_a_00128.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T07:41:34Z","timestamp":1741938094000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.sciengine.com\/doi\/10.1162\/dint_a_00128"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":36,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,4,1]]}},"URL":"https:\/\/doi.org\/10.1162\/dint_a_00128","relation":{},"ISSN":["2641-435X"],"issn-type":[{"type":"electronic","value":"2641-435X"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}