planning for CWLProv in toil-cwl-runner #2390
Description
- Refactor `CWLJob.run()` to return `(outputs, metadata)` instead of just `outputs`. `metadata` is a dictionary that will contain the information we need for generating CWLProv.
- Propagate the metadata through the `.run()` calls to the root of the computation.
- Try to reuse Toil's job store IDs (see Accessing the Jobstoreid corresponding to a job #2449): for each `CWLJob`, record this ID and the parent ID.
- Fill `metadata` with a data structure containing runtime information about the tasks (a tree or a dict, keyed by the job store IDs).
- Generate a `ProvenanceProfile` per task, and a `ResearchObject` once all the metadata has been gathered.
- Refactor `cwltool/provenance.py` so that the recorded time and the time of recording are decoupled.
- Refactor `ProvenanceProfile:prospective_prov` out of the class into a function that creates all the `ProvenanceProfile`s and relates them in a tree-like structure.
- Refactor `cwltool/provenance.py` so that we can defer file movements until the end of the run.
- Update Toil to use cwltool with the fixes (Update cwltool version to the latest #2469).
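
A rough sketch of the first few items above, to make the intended shape of `metadata` concrete. Everything in it is an assumption for illustration: `SketchJob`, `execute()`, `children_by_parent()` and the record fields are hypothetical names, not Toil or cwltool API, and a plain recursive call stands in for however Toil actually passes return values between jobs. Each job records its own ID, its parent's ID and its own start/end timestamps, so the provenance can be written later without using the time of writing.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

Metadata = Dict[str, Dict[str, Any]]


@dataclass
class SketchJob:
    """Hypothetical stand-in for a CWLJob; job_id plays the role of a job store ID."""
    job_id: str
    children: List["SketchJob"] = field(default_factory=list)

    def execute(self) -> Dict[str, Any]:
        # Placeholder for the real work a job would do.
        return {"result": f"output-of-{self.job_id}"}

    def run(self, parent_id: Optional[str] = None) -> Tuple[Dict[str, Any], Metadata]:
        # Capture timestamps when the work happens, not when provenance is written.
        started = time.time()
        outputs = self.execute()
        ended = time.time()

        # One flat record per job, keyed by its ID, pointing at its parent.
        metadata: Metadata = {
            self.job_id: {
                "parent_id": parent_id,
                "started": started,
                "ended": ended,
                "output_names": sorted(outputs),
            }
        }

        # Propagate child outputs and metadata up towards the root.
        for child in self.children:
            child_outputs, child_metadata = child.run(parent_id=self.job_id)
            outputs.update({f"{child.job_id}/{k}": v for k, v in child_outputs.items()})
            metadata.update(child_metadata)

        return outputs, metadata


def children_by_parent(metadata: Metadata) -> Dict[Optional[str], List[str]]:
    """Rebuild the tree from the flat dict: parent ID -> list of child IDs."""
    tree: Dict[Optional[str], List[str]] = {}
    for job_id, record in metadata.items():
        tree.setdefault(record["parent_id"], []).append(job_id)
    return tree


if __name__ == "__main__":
    root = SketchJob("job-root", [SketchJob("job-a", [SketchJob("job-a1")]), SketchJob("job-b")])
    outputs, metadata = root.run()
    print(children_by_parent(metadata))
```

With a flat dict keyed by job ID, the root only has to merge dictionaries, and the tree needed to relate the per-task profiles can be rebuilt from the `parent_id` fields once everything has been gathered.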
Most of the progress so far is on the `wip-prov` branch: https://github.com/DataBiosphere/toil/tree/wip-prov
┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-280