This PR fixes an issue with fork inside fork. It avoids duplicating outermost tasks when the inner fork is not wrapped in a join inside the outer fork. Previously this case was not handled properly because there was no way to check whether the outermost tasks were the same as the downstream tasks.
The rationale is that when a fork sits inside a fork without a wrapping join, the outermost tasks are the same as the downstream tasks and thus should not be added to `lineage` twice. On the other hand, when a join wraps the inner fork, the outermost tasks (after the outer fork) differ from the downstream tasks (after the inner fork), and thus both should be added to `lineage`.
To compare these two arrays (`outermostTasks` and `downstreamTasks`) I compute two hashes, one per array, by adding up the individual hashes of the tasks in that array, where each task hash is made from `task.info.operationCreator`. So, when the tasks in both arrays have the same `operationCreator` values, the two hashes are equal; otherwise they differ.
Then it was just a matter of checking whether both hashes are equal.
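For reviewers, the comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Task` shape, the `hashString` helper (a djb2-style stand-in, since the description does not name the hash function), `sameTasks`, and `tasksForLineage` are all hypothetical names, and `operationCreator` is assumed to be representable as a string.

```typescript
// Hypothetical task shape: only the field named in the PR description is modelled,
// and operationCreator is assumed to be available as a string.
interface Task {
  info: { operationCreator: string };
}

// Hypothetical stand-in hash (djb2-style); the actual hash function may differ.
function hashString(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    // Keep the value in the unsigned 32-bit range.
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Add up the per-task hashes, so the result does not depend on task order.
function hashTasks(tasks: Task[]): number {
  return tasks.reduce((sum, t) => sum + hashString(t.info.operationCreator), 0);
}

// True when both arrays produce the same summed hash.
function sameTasks(outermostTasks: Task[], downstreamTasks: Task[]): boolean {
  return hashTasks(outermostTasks) === hashTasks(downstreamTasks);
}

// Hypothetical helper: pick the tasks to add to lineage.
// Equal hashes (fork inside fork, no join): add the tasks only once.
// Different hashes (join wraps the inner fork): add both sets.
function tasksForLineage(outermostTasks: Task[], downstreamTasks: Task[]): Task[] {
  return sameTasks(outermostTasks, downstreamTasks)
    ? downstreamTasks
    : [...outermostTasks, ...downstreamTasks];
}
```

Because the per-task hashes are summed, two different sets of tasks could in principle produce the same total, so the check is a heuristic rather than a strict equality test.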
Tests for junction inside fork and fork inside fork (both with and without a wrapping join) were added to check the shape of the graph. They count the lengths of the graphson arrays for vertices and for edges. I am aware that this is far from ideal, but it is a quick way to detect whether something breaks in the near future. These tests must be reworked after the pipeline object is refactored into something that can be run before execution.