E2E tests randomly fail because containers are OOM-killed #475

@kangclzjc

What happened?

In some e2e tests, containers can be killed because of OOM. After tracking this down, I found that some of our tests use the alpine-slim image with no start command in the workload YAML. With no start command, the image runs nginx, which creates a number of worker processes based on the number of CPU cores on the node. On my L20 machine, it creates nearly 130 nginx workers for each test pod. The pods keep crashing and the node repeatedly becomes Unready. I see the same failure in our CI/CD e2e tests.
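
For illustration, one possible way to avoid the per-core worker fan-out is to give the test container an explicit start command so the image's default nginx startup is never used. This is only a sketch under assumptions, not necessarily the fix for this issue: the container name is hypothetical, and `nginx:alpine-slim` is an assumed full image reference (the tests only mention "alpine-slim").

```yaml
# Hypothetical fragment of a test workload's pod spec.
containers:
  - name: test-container          # hypothetical name, not taken from the e2e tests
    image: nginx:alpine-slim      # assumed image reference
    # Setting `command` replaces the image's default entrypoint, so nginx
    # (and its one-worker-per-core behavior) is not started at all.
    command: ["/bin/sh", "-c", "while true; do sleep 3600; done"]
```

Because `command` is set and `args` is not, Kubernetes ignores both the image's default entrypoint and its default arguments, so the container just idles in a shell loop regardless of how many cores the node has.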

What did you expect to happen?

The e2e tests should not fail because of nodes becoming Unready or pods being OOM-killed.

Environment

  • Kubernetes version
  • Grove version
  • Scheduler details
  • Cloud provider or hardware configuration
  • Tools that you are using Grove together with
  • Anything else that is relevant
