
New Workflow Modules - Input Parameters and Subworkflows #1306

Merged: guerler merged 11 commits into galaxyproject:dev from common-workflow-lab:subworkflows on Dec 17, 2015

@jmchilton
Member

Overview

This PR contains two new workflow modules to bring in concepts required to implement CWL workflows in Galaxy - nested workflows and input parameters. I developed these in the CWL fork of Galaxy but I am confident they will be very useful for vanilla Galaxy workflows.

Process

If people generally agree to this PR, I'd like to combine e4c9ddb and 22f06b7, which contain two separate database migrations, into one before it is merged. I'm keeping them separate for now in case people object to one of these workflow modules but not the other.

Nested Workflows

  • Add a new workflow module describing subworkflows.
  • Update workflow, workflow step, and workflow invocation models to track subworkflow connections and execution.
  • Extend workflow outputs with concepts of labels (and UUIDs while I'm there) to match workflow inputs. This allows us to have something to label outputs with in the workflow editor and to reference in the format 2 workflow description language.
  • Extend workflow editor UI to allow labelling workflow outputs (and enforce that these are unique across a workflow).
  • Extend workflow invocation and progress tracking to allow invoking a subworkflow as part of another workflow invocation.
  • Extend workflow import and export code to allow a nested representation of workflows.
  • Update format 2 workflow description to allow testing nested workflows.

Most relevant new and modified test cases can be run using the following commands:

    ./run_tests.sh -api test/api/test_workflows.py:WorkflowsApiTestCase.test_run_subworkflow_simple
    ./run_tests.sh -api test/api/test_workflows_from_yaml.py:WorkflowsFromYamlApiTestCase.test_subworkflow_simple
    ./run_tests.sh -api test/api/test_workflows_from_yaml.py:WorkflowsFromYamlApiTestCase.test_outputs
    nosetests test/unit/test_galaxy_mapping.py
    nosetests test/unit/workflows/test_workflow_progress.py
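
Invoking a subworkflow as part of another workflow invocation can be pictured as a recursive call that records a child invocation. The following is only a minimal sketch of that idea; `Step`, `Workflow`, `Invocation`, and `invoke` are hypothetical names for illustration and are not Galaxy's model classes or scheduling code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One workflow step: either a tool-like callable or a nested workflow."""
    label: str
    tool: Optional[Callable[[int], int]] = None  # hypothetical stand-in for a tool
    subworkflow: Optional["Workflow"] = None     # set only for subworkflow steps


@dataclass
class Workflow:
    name: str
    steps: List[Step] = field(default_factory=list)


@dataclass
class Invocation:
    """Records executed step labels; children track subworkflow invocations."""
    workflow_name: str
    executed: List[str] = field(default_factory=list)
    children: List["Invocation"] = field(default_factory=list)


def invoke(workflow: Workflow, value: int) -> Tuple[int, Invocation]:
    """Run steps in order, recursing into subworkflow steps."""
    invocation = Invocation(workflow.name)
    for step in workflow.steps:
        if step.subworkflow is not None:
            # A subworkflow step runs as its own invocation, nested in the parent.
            value, child = invoke(step.subworkflow, value)
            invocation.children.append(child)
        elif step.tool is not None:
            value = step.tool(value)
        invocation.executed.append(step.label)
    return value, invocation


inner = Workflow("inner", [Step("double", tool=lambda x: x * 2)])
outer = Workflow("outer", [Step("inc", tool=lambda x: x + 1),
                           Step("nested", subworkflow=inner)])
result, outer_invocation = invoke(outer, 3)
```

Here the parent invocation sees the subworkflow as a single step, while the child invocation keeps its own record of what ran inside it.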

The following screenshot demonstrates two ways to utilize workflows within another workflow:

  • the link icon preserves the subworkflow structure and treats it as a single unit
  • the copy icon copies the other workflow into the current workflow node for node

[Screenshot: subworkflow_types]
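
The difference between the link and copy operations can be illustrated with a small sketch. The dictionary-based step representation and both function names are assumptions made for this example only; they do not reflect how the workflow editor stores steps.

```python
from copy import deepcopy
from typing import Any, Dict, List

Step = Dict[str, Any]  # hypothetical minimal step representation


def link_subworkflow(parent: List[Step], name: str, child: List[Step]) -> List[Step]:
    """'Link': add ONE step that refers to the whole child workflow."""
    return parent + [{"type": "subworkflow", "label": name, "workflow": child}]


def copy_subworkflow(parent: List[Step], child: List[Step]) -> List[Step]:
    """'Copy': add every child step individually, node for node."""
    return parent + deepcopy(child)


parent_steps = [{"type": "tool", "label": "a"}]
child_steps = [{"type": "tool", "label": "b"}, {"type": "tool", "label": "c"}]

linked = link_subworkflow(parent_steps, "child", child_steps)
copied = copy_subworkflow(parent_steps, child_steps)
```

With linking, the parent gains a single step and the child stays a unit; with copying, the parent gains independent copies of each child step.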

Workflow Input Parameters (Experimental)

  • Implement an input parameter module that mirrors data and collection input modules but has a type that can currently be one of text, integer, float, color, and boolean.
  • Allow connections between these and tool step inputs.
  • Extend model to support this.
  • Add new input types for format 2 workflow definitions for various types that all map to this kind of step. Typed inputs such as this match well with CWL workflow inputs.

Labeled as experimental since the UI isn't developed but someday I imagine these will be superior to just marking a tool input "Specify at Runtime" for all the same reasons input steps are superior to leaving inputs unattached.
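
As a rough illustration of what a typed input parameter implies, the sketch below coerces a raw value to one of the five listed types. The function names, the accepted boolean spellings, and the `#rrggbb` color format are all assumptions for this example, not the behavior of the module in this PR.

```python
import re
from typing import Any, Callable, Dict


def _to_boolean(raw: Any) -> bool:
    text = str(raw).strip().lower()
    if text in ("true", "yes", "1"):
        return True
    if text in ("false", "no", "0"):
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def _to_color(raw: Any) -> str:
    # Assumption: colors are written as #rrggbb hex strings.
    text = str(raw).strip()
    if not re.fullmatch(r"#[0-9a-fA-F]{6}", text):
        raise ValueError(f"not a color: {raw!r}")
    return text.lower()


# One coercion function per parameter type named in the description.
COERCERS: Dict[str, Callable[[Any], Any]] = {
    "text": str,
    "integer": int,
    "float": float,
    "color": _to_color,
    "boolean": _to_boolean,
}


def coerce_parameter(param_type: str, raw: Any) -> Any:
    """Convert a raw value to the declared parameter type or raise ValueError."""
    try:
        coercer = COERCERS[param_type]
    except KeyError:
        raise ValueError(f"unsupported parameter type: {param_type!r}") from None
    return coercer(raw)
```

A declared type lets a bad value be rejected when the workflow is invoked rather than when a tool step eventually consumes it.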

@bgruening
Member

@jmchilton there are a few failing tests; in particular, the simple sub-workflow test is failing.

  • subworkflows are not expandable in the run workflow view
  • the name of the subworkflow in toolbox is not clickable
  • I was not able to do much with workflow inputs. There was no option to combine an input with a tool/subworkflow.

Midterm ideas:
I don't think listing workflows in the toolbar works for a large set of workflows. I imagine an overlay with a list of workflows and previews that you can drag and drop into your current workflow. This preview would also be nice to have as information (hover?) for the subworkflow module.

It would be nice to switch quickly to a subworkflow and adjust settings.

Docker Image to test:

docker run -p 8080:80 -e GALAXY_CONFIG_ENABLE_BETA_WORKFLOW_MODULES=True -i -t bgruening/galaxy-stable:jmc_workflow

@jmchilton
Member Author

@bgruening Thanks for the initial review and Docker image.

  • Pushed out 7c459cb, which I'll eventually rebase back into the subworkflow commit (03befc6). It should fix the 4 new failing tests that aren't also failing in "Improved validation of tools during workflow execution" (#1302). Still no clue why those existing 6 failures occur on Jenkins; they don't fail locally for me. Update: All failing tests have been fixed and the PR rebased.
  • "subworkflows are not expandable in the run workflow view" The run workflow view is on its death bed (Run workflow form #1249) so I don't intend to do any polish there.
  • "the name of the subworkflow in toolbox is not clickable" There is a little link icon next to the workflow to link the workflow in as a subworkflow. If you look at the comments in the editor mako I've already started working on a second icon for copying the clicked workflow into the workflow being edited (add all nodes at once). There will be use cases for both of these operations so I was just thinking of having separate icons instead of favoring either operation as "what" occurs when the title is clicked. I'll update PR with pictures of this if I get the second operation implemented before this is merged. Update: The PR has been updated and there are now two icons which do different things.
  • "I was not able to do much with workflow inputs. There was no option to combine an input with a tool/subworkflow." Yes, like pause steps - these require a significant overhaul to the UI to be useful from Galaxy's UI so it is labelled as "experimental" and not enabled by default. That said - they are fully functional from an API standpoint and integrated into the format 2 workflow language. An example that just defines a data and a text input and runs the random lines tool with the supplied input as the seed would be:

    class: GalaxyWorkflow
    inputs:
      - label: data_input
        type: data
      - label: text_input
        type: text
    steps:
    - label: randomlines
      tool_id: random_lines1
      state:
        num_lines: 1
        input:
          $link: data_input
        seed_source:
          seed_source_selector: set_seed
          seed:
            $link: text_input
          __current_case__: 1
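
To make the role of `$link` in the example concrete, here is a small sketch that walks the same structure (written as a Python dict) and checks that each `$link` names a declared input label. The helper names and the dotted path notation are assumptions for illustration; this is not the format 2 importer.

```python
from typing import Any, Dict, Iterator, List, Tuple

workflow: Dict[str, Any] = {
    "class": "GalaxyWorkflow",
    "inputs": [
        {"label": "data_input", "type": "data"},
        {"label": "text_input", "type": "text"},
    ],
    "steps": [
        {
            "label": "randomlines",
            "tool_id": "random_lines1",
            "state": {
                "num_lines": 1,
                "input": {"$link": "data_input"},
                "seed_source": {
                    "seed_source_selector": "set_seed",
                    "seed": {"$link": "text_input"},
                    "__current_case__": 1,
                },
            },
        }
    ],
}


def iter_links(node: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, target) for every {"$link": target} in a nested state."""
    if isinstance(node, dict):
        if "$link" in node:
            yield path, node["$link"]
            return
        for key, value in node.items():
            yield from iter_links(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_links(value, f"{path}[{index}]")


def unresolved_links(wf: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (step label, path, target) for links that name no declared input."""
    known = {inp["label"] for inp in wf["inputs"]}
    return [
        (step["label"], path, target)
        for step in wf["steps"]
        for path, target in iter_links(step.get("state", {}))
        if target not in known
    ]


links = list(iter_links(workflow["steps"][0]["state"]))
problems = unresolved_links(workflow)
```

Both the data input and the text input are connected the same way, which is the point of the example: a parameter input is wired to a tool parameter just like a dataset input.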

Your midterm ideas are excellent. I think many of them would be more competently implemented by @guerler. Workflow previews seem on the surface challenging - maybe just exposing the description and annotation information would be more feasible. Regardless, I'll create an issue for refinements to workflow nesting.

@jmchilton
Member Author

Updated PR with bug fixes (same fixes that fixed #1302) and adding the "copy into" functionality in addition to the "link into" functionality already included in the PR. Updated description to include a screenshot distinguishing these.

@bgruening
Member

Image is updated:

    docker run -p 8080:80 -i -t bgruening/galaxy-stable:jmc_workflow

> "subworkflows are not expandable in the run workflow view" The run workflow view is on its death bed (Run workflow form #1249) so I don't intend to do any polish there.

Fair enough.

> "the name of the subworkflow in toolbox is not clickable" There is a little link icon next to the workflow to link the workflow in as a subworkflow. If you look at the comments in the editor mako I've already started working on a second icon for copying the clicked workflow into the workflow being edited (add all nodes at once). There will be use cases for both of these operations so I was just thinking of having separate icons instead of favoring either operation as "what" occurs when the title is clicked. I'll update PR with pictures of this if I get the second operation implemented before this is merged.

I saw this small symbol and everything worked great. Maybe I forgot to say that, I'm too excited!!!
I still think that a user expects the name itself to be clickable and not just the symbols to the right. This is a usability issue and probably does not belong in this PR.
What about a single link that embeds the subworkflow as one item, with the expandable option offered in the header of this item?

> "I was not able to do much with workflow inputs. There was no option to combine an input with a tool/subworkflow." Yes, like pause steps - these require a significant overhaul to the UI to be useful from Galaxy's UI so it is labelled as "experimental" and not enabled by default. That said - they are fully functional from an API standpoint and integrated into the format 2 workflow language. An example that just defines a data and a text input and runs the random lines tool with the supplied input as the seed would be:

Awesome!

+1 from my side if the usability issues can be fixed in this release cycle.

@bgruening
Member

@jmchilton one question regarding your Input Parameters. Is this framework flexible enough to enable text files as parameter inputs? Use case: Calculate the inner mate distance of paired-end reads and use this output as an integer input parameter in a mapping tool.

@jmchilton
Member Author

@bgruening Not yet, CWL has this concept of expression tools and I have been working on a pure Galaxy (you know with XML 😄) analog common-workflow-lab@96f6550 that should enable something like this. Needs some work still, I am hacking on it though.

@bgruening
Member

Awesome you rock! Just wanted to make sure we can tackle this use case as well in your implementation/design ... but sure you thought of everything! I should have known better!

@yvanlebras
Contributor

Thanks @jmchilton !!! Really terrific!!!!!! After some tests on the Docker image through our genocloud instance http://cloud-87.genouest.org/, it seems to work. Just a few remarks. Is it possible to see the name of the subworkflow in the workflow editor? We have encountered some save / rename issues that seem to be more related to communication lag between Docker image / cloud machine / host machine than to the new workflow functionalities.

@jmchilton
Member Author

Now rebased for #1322 and #1323.

I'll work on placing some information about the original workflow in for @yvanlebras.

Does anybody agree strongly with @bgruening about clicking the workflows? I feel like I prefer the current setup with two icons over this suggestion.

@nsoranzo
Member

> Does anybody agree strongly with @bgruening about clicking the workflows? I feel like I prefer the current setup with two icons over this suggestion.

Not a usability expert, but I think that the current setup is ok because the workflow name is not underlined.

@jmchilton
Member Author

@nsoranzo Am I going to get a line-by-line 🐦 👀 critique of this or is it too big? Is there some sort of API for requesting such reviews 😄?

@nsoranzo
Member

@jmchilton 😆 I'm going on holiday in 24h, I doubt I'll have time to review any big PR for the next 20 days, sorry!

@jmchilton
Member Author

@nsoranzo We pay you too much to take 20 days off - GET BACK TO WORK 🐦 👀!

jmchilton added a commit to common-workflow-lab/galaxy that referenced this pull request Dec 15, 2015
@jmchilton
Member Author

I know a few different people who care about workflows have at least looked at and thought about the input parameter stuff and had at worst neutral comments - so I am going to go ahead and merge those database migrations.

jmchilton added a commit to common-workflow-lab/galaxy that referenced this pull request Dec 15, 2015
jmchilton and others added 7 commits December 16, 2015 15:22
 - Implement an input parameter module that mirrors data and collection input modules but has a type that can currently be one of text, integer, float, color, and boolean.
 - Allow connections between these and tool step inputs.
 - Extend model to support this.
 - Add new input types for format 2 workflow definitions for various types that all map to this kind of step. Typed inputs such as this match well with CWL workflow inputs.

Someday I imagine these will be superior to just marking a tool input "Specify at Runtime" for all the same reasons input steps are superior to leaving inputs unattached.
Details:

 - Add a new workflow module describing subworkflows.
 - Add workflow list to editor side panel - with options to link in a subworkflow module or copy the target workflow into the workflow being edited node for node.
 - Update workflow, workflow step, and workflow invocation models to track subworkflow connections and execution.
 - Extend workflow outputs with concepts of labels (and UUIDs while I'm there) to match workflow inputs. This allows us to have something to label outputs with in the workflow editor and to reference in the format 2 workflow description language.
 - Extend workflow editor UI to allow labeling workflow outputs (and enforce that these are unique across a workflow).
 - Extend workflow invocation and progress tracking to allow invoking a subworkflow as part of another workflow invocation.
 - Extend workflow import and export code to allow a nested representation of workflows.
 - Update format 2 workflow description to allow testing nested workflows.

Most relevant new and modified test cases can be run using the following commands:

```
./run_tests.sh -api test/api/test_workflows.py:WorkflowsApiTestCase.test_run_subworkflow_simple
./run_tests.sh -api test/api/test_workflows_from_yaml.py:WorkflowsFromYamlApiTestCase.test_subworkflow_simple
./run_tests.sh -api test/api/test_workflows_from_yaml.py:WorkflowsFromYamlApiTestCase.test_outputs
nosetests test/unit/test_galaxy_mapping.py
nosetests test/unit/workflows/test_workflow_progress.py
```
guerler self-assigned this Dec 16, 2015
@jmchilton
Member Author

Picture of updated UI by @guerler that allows clicking on the workflows common-workflow-lab#13.

@guerler Is that the last of your changes? If you are ready to merge, I can rebase.

@martenson
Member

The TS test run fails at

File "lib/galaxy/model/migrate/versions/0131_subworkflow_and_input_paramter_modules.py", line 28, in <module>
    Column( "parameter_value", JSONType ),
NameError: name 'JSONType' is not defined

@jmchilton
Member Author

@martenson should be fixed by c04899a. This branch is an absolute mess, but @guerler is working on some stuff still so I don't want to rebase quite yet.

@jmchilton
Member Author

Old messy history backed up here https://github.com/jmchilton/galaxy/tree/subworkflow_prerebase_1, I've rebased on top of the latest dev and this is now a clean history with one database migration.

@guerler
Contributor

guerler commented Dec 17, 2015

@yvanlebras The subworkflow name is shown now in the workflow editor and can be edited.

As usual there are many more good follow up suggestions but this looks like a great step towards subworkflows to me 👍

guerler added a commit that referenced this pull request Dec 17, 2015
New Workflow Modules - Input Parameters and Subworkflows
guerler merged commit 41da9b0 into galaxyproject:dev Dec 17, 2015
@jmchilton
Member Author

@guerler thanks for the merge and thanks for all the enhancements!

jmchilton deleted the subworkflows branch December 17, 2015 15:07
jmchilton modified the milestone: 16.01 Jan 5, 2016
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Mar 18, 2019
PR galaxyproject#6925 introduced a GUI for connecting non-data (e.g. integer, boolean, color, etc.) workflow input parameters to tool input parameters (the backend for this was originally added in galaxyproject#1306). That ideally was just the beginning of work toward using such values in structured ways in workflows.

This PR extends tool output handling to allow producing non-data parameters. These can serve as a source for non-data values in workflows the same way workflow input parameters can. To make such values easier to produce, this PR also introduces Galaxy expression tools - mirroring functionality regularly used in CWL. These are small JavaScript-based tools that consume inputs just like a regular Galaxy tool but produce a dictionary of non-data values.

I think these expressions will be maximally useful when paired with format 2 workflows once we allow users to load arbitrary tools (I make the case more in full here galaxyproject#7545 (comment)), but I outline some potential uses there as well.

Because there is always a checklist in my PR descriptions:

- Tool definition language and plumbing and datatype for expressing expressions as jobs.
- Allow connecting expression tools to parameters in workflows, will delay evaluation of workflow so calculated value
- Example test expression tools for testing and demonstration.
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Mar 19, 2019