Skip to content

Fix conducting of cycle with a fork#170

Merged
m4dcoder merged 1 commit intomasterfrom
fix-cycle-fork
Jul 10, 2019
Merged

Fix conducting of cycle with a fork#170
m4dcoder merged 1 commit intomasterfrom
fix-cycle-fork

Conversation

@m4dcoder
Copy link
Collaborator

@m4dcoder m4dcoder commented Jul 9, 2019

If there is a branch that fork from a cycle in the workflow graph, during runtime and the workflow runs the cycle more than once, tasks under the forked branch will not execute a second time. The issue is caused by the networkx graph not recognizing the forked branch as part of the cycle. The conductor is changed to determine a task is executed more than once as part of the cycle when the last task entry is completed and the new task status is one of the starting execution statuses. Fixes #169

If there is a branch that fork from a cycle in the workflow graph, during runtime and the workflow runs the cycle more than once, tasks under the forked branch will not execute a second time. The issue is caused by the networkx graph not recognizing the forked branch as part of the cycle. The conductor is changed to determine a task is executed more than once as part of the cycle when the last task entry is completed and the new task status is one of the starting execution statuses.
@m4dcoder m4dcoder requested a review from jinpingh July 9, 2019 23:40
Copy link
Contributor

@jinpingh jinpingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

networkx

Typo!

toil:
action: core.echo message="This is hard work."
next:
- do: query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow here is infinity loop, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jinpingh If you look at the workflow definition and the unit test closely, it is not an infinitely loop practically. In the unit test, I'm controlling the result that is returned by the query task which controls when the loop terminates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood that unit test takes care of when the loop terminates. Just want to make sure that I understand how this fork working. :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. You are asking the right questions.

# misses forks that extends from the cycle. The check here assumes that the
# last task entry is already completed and the new task status is one of the
# starting statuses, then there is high likelihood that this is a cycle.
if (task_state_entry.get('status') in statuses.COMPLETED_STATUSES and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above line 655 commented If task is already completed and in cycle, the code here did not check if task is in cycle self.graph.in_cycle(task_id).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment If task is already completed and in cycle is saying what must be checked. But if you look at the description right after that, because the method in networkx library to check cycle is too simple, here is what we are doing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should change comment to remove cycle?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we shouldn't change comment to remove "cycle". We want to detect whether the current task is part of a cycle. It is just that the method in networkx is unable to determine that this very specific use case with a fork stems from a cycle.

@m4dcoder
Copy link
Collaborator Author

m4dcoder commented Jul 10, 2019

@jinpingh networkx is not a typo. It's a python library for graphing. https://networkx.github.io

@jinpingh
Copy link
Contributor

@jinpingh networkx is not a typo. It's a python library for graphing.

Thanks!

Copy link
Contributor

@jinpingh jinpingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@m4dcoder m4dcoder merged commit a8d5378 into master Jul 10, 2019
@m4dcoder m4dcoder deleted the fix-cycle-fork branch July 10, 2019 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tasks in a fork from a cycle do not execute

2 participants