feat(injector/cpu-pressure): revamp#704

Merged
luphaz merged 19 commits into main from
matthieu.bono/CHAOS-503/cpu-pressure-fix
May 25, 2023

@luphaz commented Apr 20, 2023

What does this PR do?

  • Adds new functionality
  • Alters existing functionality
  • Fixes a bug
  • Improves documentation or testing

Please briefly describe your changes as well as the motivation behind them:

  • This PR adapts the cpu-pressure injector so that it reacts to container restarts.
  • This PR also fixes a couple of bugs seen along the way:
    • Exit without error
    • Stress all containers, not only the last one
    • Stress all cores on all containers
    • Split DisruptionInjectionStatus and DisruptionTargetInjectionStatus to avoid any misunderstanding about which one is allowed to hold which values.
    • Add new DisruptionInjectionStatus values to avoid a misleading PreviouslyInjected status. Previously, an expired disruption moved to that same status even when it had been only partially injected or not injected at all; this is no longer the case.
    • Fix the point at which the "previously injected" status is calculated, so that when injection pods exit successfully near the deadline we do not transition back to PartiallyInjected.
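
To illustrate the status change described in the list above, here is a minimal sketch of mapping each in-progress status to its own terminal status when a disruption expires. Only PartiallyInjected and PreviouslyInjected are named in this description; the type name, the other constants, and the `OnExpiry` helper are hypothetical and are not taken from the PR's code.

```go
package main

import "fmt"

// InjectionStatus is a hypothetical stand-in for DisruptionInjectionStatus.
type InjectionStatus string

const (
	NotInjected       InjectionStatus = "NotInjected"
	PartiallyInjected InjectionStatus = "PartiallyInjected"
	Injected          InjectionStatus = "Injected"

	// Terminal statuses: the first two names are assumptions for illustration.
	PreviouslyNotInjected       InjectionStatus = "PreviouslyNotInjected"
	PreviouslyPartiallyInjected InjectionStatus = "PreviouslyPartiallyInjected"
	PreviouslyInjected          InjectionStatus = "PreviouslyInjected"
)

// OnExpiry returns the status an expired disruption should end up with.
// Each in-progress status gets a distinct terminal status, so a disruption
// that was never fully injected is not reported as PreviouslyInjected.
func OnExpiry(s InjectionStatus) InjectionStatus {
	switch s {
	case NotInjected:
		return PreviouslyNotInjected
	case PartiallyInjected:
		return PreviouslyPartiallyInjected
	case Injected:
		return PreviouslyInjected
	default:
		// Already terminal (or unknown): leave it unchanged.
		return s
	}
}

func main() {
	for _, s := range []InjectionStatus{NotInjected, PartiallyInjected, Injected} {
		fmt.Printf("%s -> %s\n", s, OnExpiry(s))
	}
}
```

Because terminal statuses map to themselves, calling the helper a second time after expiry cannot move a disruption back to an in-progress status.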

This PR also takes the opportunity to (re)introduce a make local rule to run the controller outside of Kubernetes.
This PR also introduces a make pre-debug rule that behaves like make local, except that it won't launch the binary and lets you use your favorite debugging method (IDE, dlv, ...).

This PR also reviews how we run the e2e tests, to guarantee that pods are created for each test run and to make the tests easier to duplicate.
E2E tests are hence now run in parallel.

Code Quality Checklist

  • The documentation is up to date.
  • My code is sufficiently commented and passes continuous integration checks.
  • I have signed my commit (see Contributing Docs).

Testing

  • I leveraged continuous integration testing
    • by depending on existing unit tests or end-to-end tests.
    • by adding new unit tests or end-to-end tests.
  • I manually tested the following steps:
    • locally.
    • as a canary deployment to a cluster.

var NoSideEffectDisruptions = map[chaostypes.DisruptionKindName]struct{}{
	chaostypes.DisruptionKindNodeFailure:      {},
	chaostypes.DisruptionKindContainerFailure: {},
	chaostypes.DisruptionKindCPUPressure:      {},

We now move several processes to other cgroups. Even though we take great care to kill them if anything goes wrong, we are not exempt from bugs, so this disruption might have side effects.

	return strings.TrimSuffix(content, "\n"), nil
}

func (cg cgroup) ReadCPUSet() (cpuset.CPUSet, error) {

It's much easier to have this here than in the caller; it feels to me that it's the cgroup package's responsibility to know such things.
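
As a rough illustration of the kind of knowledge such a method keeps away from callers, here is a minimal sketch of parsing the Linux cpuset list format (for example "0-2,5"). The method in this PR returns a cpuset.CPUSet; this sketch returns a plain sorted slice instead, and the `ParseCPUList` name is hypothetical, not the PR's implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// ParseCPUList parses a cpuset list such as "0-2,5\n" into sorted,
// de-duplicated CPU indexes. Hypothetical helper for illustration only.
func ParseCPUList(content string) ([]int, error) {
	content = strings.TrimSpace(content)
	if content == "" {
		return nil, nil // an empty file means no CPUs are assigned
	}

	seen := map[int]struct{}{}

	for _, part := range strings.Split(content, ",") {
		// Each part is either a single CPU ("5") or an inclusive range ("0-2").
		lo, hi, isRange := strings.Cut(part, "-")

		start, err := strconv.Atoi(lo)
		if err != nil {
			return nil, fmt.Errorf("invalid cpu list entry %q: %w", part, err)
		}

		end := start
		if isRange {
			if end, err = strconv.Atoi(hi); err != nil {
				return nil, fmt.Errorf("invalid cpu list entry %q: %w", part, err)
			}
		}

		if end < start {
			return nil, fmt.Errorf("invalid cpu range %q", part)
		}

		for cpu := start; cpu <= end; cpu++ {
			seen[cpu] = struct{}{}
		}
	}

	cpus := make([]int, 0, len(seen))
	for cpu := range seen {
		cpus = append(cpus, cpu)
	}
	sort.Ints(cpus)

	return cpus, nil
}

func main() {
	cpus, err := ParseCPUList("0-2,5\n")
	fmt.Println(cpus, err)
}
```

With the parsing kept next to the file read, a caller only deals with a set of CPU indexes and never with the raw file format.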

@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch 4 times, most recently from 5511ee5 to ff7ec0e, April 20, 2023 12:56
@luphaz force-pushed the matthieu.bono/CHAOS-503/upgrade-gingko-v2 branch from 7075380 to 8bb53be, April 20, 2023 12:58
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch from ff7ec0e to 51825bf, April 20, 2023 12:59
Base automatically changed from matthieu.bono/CHAOS-503/upgrade-gingko-v2 to main, April 20, 2023 13:54
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch 4 times, most recently from 3f300f3 to d609ed5, April 20, 2023 16:12
@luphaz marked this pull request as ready for review, April 20, 2023 16:19
@luphaz requested a review from a team as a code owner, April 20, 2023 16:19
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch 2 times, most recently from 06afd0a to 1aea2d7, April 20, 2023 18:23
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch from e8f50d3 to 79bfb04, April 20, 2023 21:18
@luphaz luphaz mentioned this pull request Apr 20, 2023
13 tasks
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch from f933ca0 to b4ee22d, April 21, 2023 00:09
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch 2 times, most recently from 6fa29d4 to 5560eed, May 17, 2023 07:01
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch from 9a2f5b0 to d90d0e3, May 17, 2023 11:56
@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch 3 times, most recently from 558e712 to b2c1981, May 23, 2023 07:45
datadog-datadog-prod-us1 bot commented May 24, 2023

Datadog Report

Branch report: matthieu.bono/CHAOS-503/cpu-pressure-fix
Commit report: 492cfcd

chaos-controller: 0 Failed, 0 New Flaky, 409 Passed, 0 Skipped, 2m 45.32s Wall Time

@luphaz force-pushed the matthieu.bono/CHAOS-503/cpu-pressure-fix branch from b2c1981 to b65d56f, May 24, 2023 09:36
5 participants