Skip to content

Improved validation of tools during workflow execution.#1302

Merged
guerler merged 3 commits intogalaxyproject:devfrom
jmchilton:workflow_validation_1
Dec 14, 2015
Merged

Improved validation of tools during workflow execution.#1302
guerler merged 3 commits intogalaxyproject:devfrom
jmchilton:workflow_validation_1

Conversation

@jmchilton
Copy link
Member

Do a more complete validation of everything after workflow parameters have been replaced and inputs connected up.

Attempt 2 at the idea in #1284.

Lets see if the tests go any better now.

@jmchilton
Copy link
Member Author

6 failures instead of 44. Making progress.

@jmchilton
Copy link
Member Author

Ugh - none of these fail for me locally. Odd, maybe because I rebased against dev before testing - forced push and we will try again.

@jmchilton
Copy link
Member Author

Probably a problem with validate on postgres for these tools - maybe a unicode vs str thing? If that is what it is though I'm annoyed nothing would show up in the logs so I pushed out 41590e3 which should result in job creation problems from halting workflow execution (not job execute/tool failure problems, job creation problems). It also adds a lot more logging on these workflow problems. Hopefully this helps.

@jmchilton
Copy link
Member Author

Progress:

galaxy.tools.execute: WARNING: There was a failure executing a job for tool [cat_list] - {'input1': "Value no longer valid for 'Concatenate Dataset', replaced with default"}
galaxy.tools.execute: DEBUG: Executed all jobs for tool request: (42.009 ms)
galaxy.workflow.run: ERROR: Failed to schedule Workflow[id=9name=Workflow (imported from API)], problem occurred on WorkflowStep[index=4,type=tool].
Traceback (most recent call last):
  File "/galaxy/lib/galaxy/workflow/run.py", line 155, in invoke
    jobs = self._invoke_step( step )
  File "/galaxy/lib/galaxy/workflow/run.py", line 221, in _invoke_step
    jobs = step.module.execute( self.trans, self.progress, self.workflow_invocation, step )
  File "/galaxy/lib/galaxy/workflow/modules.py", line 875, in execute
    raise Exception("Failed to create %d jobs for workflow." % count)
Exception: Failed to create 1 jobs for workflow.

Will fix the logging typos and log the actual reason the validation is failing.

@jmchilton jmchilton force-pushed the workflow_validation_1 branch from 41590e3 to 918ce64 Compare December 12, 2015 21:20
@jmchilton
Copy link
Member Author

With extra logging tests discovered a fairly substantial and unfortunate bug, fixed with 918ce64.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log the exception and raise it afterwards, like in the run.py or the other way around?

Do a more complete validation of everything after workflow parameters have been replaced and inputs connected up.

Attempt 2 at the idea in galaxyproject#1284.
@jmchilton jmchilton force-pushed the workflow_validation_1 branch from bff1785 to ce787ad Compare December 14, 2015 20:31
@guerler
Copy link
Contributor

guerler commented Dec 14, 2015

👍

guerler added a commit that referenced this pull request Dec 14, 2015
Improved validation of tools during workflow execution.
@guerler guerler merged commit 5bf926b into galaxyproject:dev Dec 14, 2015
@jmchilton jmchilton modified the milestone: 16.01 Jan 5, 2016
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Jan 19, 2016
Prior to PR galaxyproject#1302 workflows wouldn't validate call validate at all on inputs. 1302 was perhaps an over correction that causes validate to be called on datasets that aren't yet READY but require metadata. This rolls that back somewhat, other parameters will still validate - but validation asserting the existence of metadata won't get called in workflows. This is still not ideal but after 1302 and this - we are still in a better, more secure place than beforehand.

One might argue that since we need to introduce a new validation point (right before a job is scheduled and set to QUEUED) - that we would be better off rolling back galaxyproject#1302 entirely. I've considered that, but I still think it is better to fail fast on the other problems, also I would rather call the validation code too many times than not enough and it would be very late in the release cycle for such a change.
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Jan 19, 2016
Prior to PR galaxyproject#1302 workflows wouldn't validate call validate at all on inputs. 1302 was perhaps an over correction that causes validate to be called on datasets that aren't yet READY but require metadata. This rolls that back somewhat, other parameters will still validate - but validation asserting the existence of metadata won't get called in workflows. This is still not ideal but after 1302 and this - we are still in a better, more secure place than beforehand.

One might argue that since we need to introduce a new validation point (right before a job is scheduled and set to QUEUED) - that we would be better off rolling back galaxyproject#1302 entirely. I've considered that, but I still think it is better to fail fast on the other problems, also I would rather call the validation code too many times than not enough and it would be very late in the release cycle for such a change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants