Something changed in 1.30 so Java application memory usage drastically changed behaviour? #127844

@Aracki

What happened?

The memory usage is observed with container_memory_working_set_bytes.
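For context, as far as I understand it, cAdvisor derives this metric as cgroup memory usage minus inactive file-backed pages, floored at zero. Reclaimable page cache on the inactive list is therefore excluded, while active file pages still count. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the inputs stand in for the contents of the cgroup's usage file and `memory.stat`:

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(usage, stat):
    """Usage minus inactive file pages, never below zero."""
    # Assumption: cgroup v1 reports the hierarchical value as
    # total_inactive_file, while cgroup v2 reports it as inactive_file.
    inactive = stat.get("total_inactive_file", stat.get("inactive_file", 0))
    return max(usage - inactive, 0)


# Illustrative numbers only, not taken from the clusters in this report.
sample_stat = parse_memory_stat("anon 300\ninactive_file 600\nactive_file 100\n")
print(working_set_bytes(1000, sample_stat))  # 400
```

If this is right, a drop in the graph means pages moved off the active list or were reclaimed, not necessarily that the JVM heap shrank.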

Before 1.30:

image

After upgrading to 1.30:

image

We haven't changed anything related to Kafka configuration in the meantime. Version used: quay.io/strimzi/kafka:0.33.0-kafka-3.2.0.

The problem with the new behaviour is that we sometimes get NodeHasInsufficientMemory, which means Kafka takes longer to recover.

The same change in behaviour shows up in other Java applications, such as Cassandra.

What did you expect to happen?

I'd expect the cache to fill the available memory and usage to stay near the limit, as it did before 1.30.

How can we reproduce it (as minimally and precisely as possible)?

Run a Kafka cluster on 1.29 and on 1.30. Kafka will always fill all the memory it can (up to either the container memory limit or the node's memory).

On 1.30 you will see a pattern of cache memory being cleared.

Anything else we need to know?

This is happening with multiple Kafka clusters. Some of them run on cgroup v1 nodes and some on cgroup v2 nodes.
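For anyone who wants to confirm which cgroup version a given node uses, here is a minimal sketch. It assumes the usual mount point `/sys/fs/cgroup` and relies on the fact that a unified (v2) hierarchy exposes a `cgroup.controllers` file at its root; the function name is hypothetical:

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if root looks like a unified cgroup v2 hierarchy, else 1."""
    # cgroup.controllers exists only at the root of a cgroup v2 mount.
    # Hybrid setups mount v2 elsewhere, so they are reported as 1 here.
    return 2 if (Path(root) / "cgroup.controllers").exists() else 1


if __name__ == "__main__":
    print(cgroup_version())
```

This needs to run on the node itself or in a pod that can see the host's cgroup mount.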

Kubernetes version

v1.30.5-gke.1014000

Cloud provider

GKE (1.30.5-gke.1014000)

OS version

NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
GOOGLE_METRICS_PRODUCT_ID=26
KERNEL_COMMIT_ID=395e8b40dd8bc3fe97fa563ffa370c25bd1da560
GOOGLE_CRASH_ID=Lakitu
VERSION=113
VERSION_ID=113
BUILD_ID=18244.151.27

$ uname -a

Linux 6.1.100+ #1 SMP PREEMPT_DYNAMIC Sat Aug 24 16:19:44 UTC 2024 x86_64 AMD EPYC 7B13 AuthenticAMD GNU/Linux

Labels: kind/bug, needs-triage, sig/node
