Fix flaky TestControllerGameServerCount race condition #4392

Merged
markmandel merged 2 commits into agones-dev:main from markmandel:flaky/TestControllerGameServerCount
Dec 18, 2025

@markmandel
Collaborator

What type of PR is this?

/kind cleanup

What this PR does / Why we need it:

The test was flaky because of a race condition in its wait loop. The condition checked only that 4 time series existed (len(m.TimeSeries) == 4) and did not verify the count values themselves. As a result, the test could proceed to its assertions when only 1 of the 2 PortAllocation GameServers had been added to the informer cache.

The fix updates the wait condition to explicitly verify that the PortAllocation metric has a count of 2 before proceeding to assertions. This ensures both GameServers are present in the cache and counted correctly, eliminating the race condition.

Which issue(s) this PR fixes:

Closes #4389

Special notes for your reviewer:

@markmandel markmandel added the area/tests (Unit tests, e2e tests, anything to make sure things don't break) label Dec 18, 2025
@github-actions github-actions bot added the kind/cleanup (Refactoring code, fixing up documentation, etc) and size/S labels Dec 18, 2025
@markmandel markmandel force-pushed the flaky/TestControllerGameServerCount branch from 7b4fe73 to d9d9f4b Compare December 18, 2025 04:50
@agones-bot
Collaborator

Build Succeeded 🥳

Build Id: b10926fc-124a-4876-acf1-1d5e9bd450fc

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4392/head:pr_4392 && git checkout pr_4392
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.55.0-dev-d9d9f4b

Collaborator

@vicentefb vicentefb left a comment

/lgtm

@markmandel markmandel enabled auto-merge (squash) December 18, 2025 16:05
@agones-bot
Collaborator

Build Failed 😭

Build Id: 5c2ab7cd-f872-4c18-b3c9-1a232f44ea98

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

/gcbrun

@agones-bot
Collaborator

Build Failed 😭

Build Id: 9057fb67-573f-4fbe-a082-c62de9ef3c1b

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Collaborator Author

/gcbrun

@agones-bot
Collaborator

Build Succeeded 🥳

Build Id: a1872f61-e958-421a-b643-4ff15cd2ec56

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4392/head:pr_4392 && git checkout pr_4392
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.55.0-dev-c5b9341

@markmandel markmandel merged commit 109a4c4 into agones-dev:main Dec 18, 2025
4 checks passed
@markmandel markmandel deleted the flaky/TestControllerGameServerCount branch December 18, 2025 19:59
mnthe pushed a commit to mnthe/agones that referenced this pull request Mar 23, 2026

Labels

area/tests: Unit tests, e2e tests, anything to make sure things don't break
kind/cleanup: Refactoring code, fixing up documentation, etc
size/S

Development

Successfully merging this pull request may close these issues.

Flaky Unit Test: TestControllerGameServerCount

3 participants