What happened?
In some e2e tests, containers get OOM-killed. After tracking this down, I found that some of our tests use the alpine-slim image with no start command in the workload YAML. Without a start command, this image starts a number of nginx worker processes based on the CPU core count of the node. On my L20 machine, that is nearly 130 nginx workers for each test pod. The pods keep crashing and the node repeatedly becomes Unready. I see the same failure in our CI/CD e2e tests.
What did you expect to happen?
e2e tests should not fail because of OOM-killed pods or nodes becoming Unready.
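One possible mitigation is to give the test container an explicit start command that pins the worker count. The fragment below is only a sketch under assumptions: the container name and the `nginx:alpine-slim` tag are placeholders, and it assumes the image ships the stock `/etc/nginx/nginx.conf` with a `worker_processes` line at the start of a line. It rewrites that line instead of passing `worker_processes` through `-g`, because nginx rejects a directive that is set both in the config file and on the command line.

```yaml
# Hypothetical container spec fragment; the name and image tag are placeholders.
containers:
  - name: test-nginx            # hypothetical container name
    image: nginx:alpine-slim    # assumed tag for the alpine-slim image
    command: ["/bin/sh", "-c"]
    args:
      # Pin nginx to one worker (assumes the stock config path),
      # then run nginx in the foreground as the container's main process.
      - >-
        sed -i 's/^worker_processes.*/worker_processes 1;/' /etc/nginx/nginx.conf
        && exec nginx -g 'daemon off;'
```

With a single worker per pod, memory use should no longer scale with the node's CPU core count, though I have not verified this against the e2e suite.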
Environment
- Kubernetes version
- Grove version
- Scheduler details
- Cloud provider or hardware configuration
- Tools that you are using Grove together with
- Anything else that is relevant