Skip to content

fix: testground run with --wait exits with code = 0 on some failures#1347

Merged
laurentsenta merged 17 commits intomasterfrom
fix/fail-on-run-with-wait
Jun 20, 2022
Merged

fix: testground run with --wait exits with code = 0 on some failures#1347
laurentsenta merged 17 commits intomasterfrom
fix/fail-on-run-with-wait

Conversation

@laurentsenta
Copy link
Copy Markdown
Contributor

@laurentsenta laurentsenta commented Jun 6, 2022

Fixes #1349
Contributes to #1304

Context:

Testground defines a task.Task object with an anonymous Result field. The content of this field depends on the builder or runner used, for example:

  • if the task is Build => Result contains an array of artifacts (probably docker images),
  • if the task is Run, with a local_exec runner => Result contains nil,
  • if the Task is Run, with a k8s or local docker runner => Result contains a struct with different Outcomes field.

That's fine for storage / serialization (when used as a wire format), but any piece of code that "receive" a task object needs to interpret the Result field by knowing about the runners. There are hints of the code required in the daemon/tasks.go code, see related PR: #1321. It create cyclic dependencies and bugs (because not all parts of the code use the same interpretation, see issue #1349).

Later it would be nice to rethink this data model.

Tasks

  • introduce a data.DecodeTaskOutcome function, (a non-intrusive way to make sure we have ONE reusable way to compute task outcomes),
  • make sure running a job with --wait will fail with exit code != 0 if the outcome is not a success.
  • Return with an exit code 3 if the command failed during collect.
  • Update tests: now that we check for actual outcomes, most test are failing

@laurentsenta laurentsenta marked this pull request as draft June 6, 2022 16:50
@laurentsenta laurentsenta changed the title test: prepare integration test for silent failure fix: testground run with --wait exits with code = 0 on some failures Jun 6, 2022
@laurentsenta laurentsenta force-pushed the fix/fail-on-run-with-wait branch from c37835e to aed4473 Compare June 7, 2022 10:06
@laurentsenta
Copy link
Copy Markdown
Contributor Author

I'll revisit #1321 when we approve & merge this.

@laurentsenta
Copy link
Copy Markdown
Contributor Author

laurentsenta commented Jun 9, 2022

(temporarily adding the fix from #1353 to make sure the CI is passing removed now)

@laurentsenta laurentsenta force-pushed the fix/fail-on-run-with-wait branch from b2e558c to d7ed2e6 Compare June 10, 2022 08:27
@laurentsenta laurentsenta force-pushed the fix/fail-on-run-with-wait branch from d7ed2e6 to 15cc3d6 Compare June 14, 2022 15:29
@laurentsenta laurentsenta force-pushed the fix/fail-on-run-with-wait branch from 15cc3d6 to 0555a78 Compare June 14, 2022 15:31
@laurentsenta laurentsenta force-pushed the fix/fail-on-run-with-wait branch from 0555a78 to 6899d2c Compare June 14, 2022 15:51
@laurentsenta laurentsenta marked this pull request as ready for review June 14, 2022 16:00
@laurentsenta laurentsenta requested a review from brdji June 20, 2022 12:26
@laurentsenta laurentsenta merged commit ced4bb1 into master Jun 20, 2022
@laurentsenta laurentsenta deleted the fix/fail-on-run-with-wait branch June 20, 2022 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

A testground run might end with a status code = 0 despite the job failing (hence the need for the entrypoint.sh script)

2 participants