Implement Wasm autoscaler policy controller logic #4299

Merged
markmandel merged 1 commit into agones-dev:main from markmandel:feature/wasm-controller-logic on Oct 15, 2025

Conversation

@markmandel
Collaborator

What type of PR is this?

/kind feature

What this PR does / Why we need it:

Add controller implementation for Wasm-based fleet autoscaling:

  • Implement applyWasmPolicy function to execute Wasm modules for autoscaling decisions
  • Add Wasm plugin lifecycle management with proper initialization and cleanup
  • Integrate Extism SDK for Wasm module loading and execution
  • Support plugin configuration and hash verification
  • Add FleetAutoscaleReview request/response handling for Wasm plugins
  • Include comprehensive unit tests for Wasm policy functionality
  • Add vendor dependencies: Extism SDK, Wazero runtime, Observe SDK, and OpenTelemetry proto

Which issue(s) this PR fixes:

Work on #4080

Special notes for your reviewer:

Next will be e2e tests with the feature flags moved to alpha -- then also docs!

@markmandel markmandel added the area/user-experience label Oct 6, 2025
@github-actions github-actions bot added the kind/feature and size/XL labels Oct 6, 2025
@github-actions
github-actions bot commented Oct 6, 2025

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

@agones-bot
Collaborator

Build Succeeded 🥳

Build Id: d7ad11ea-1a1e-4649-84e0-80930bb9fab5

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4299/head:pr_4299 && git checkout pr_4299
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.53.0-dev-bf2e78d

Collaborator

@lacroixthomas lacroixthomas left a comment

Looking great !

I'm not sure about the "limited" part, if there should also be something to prevent going over a max value ? Or we expect that they handle it from their wasm ? 🤔

(also, the joy of the vendors hehe)

@lacroixthomas
Collaborator

> Looking great !
>
> I'm not sure about the "limited" part, if there should also be something to prevent going over a max value ? Or we expect that they handle it from their wasm ? 🤔
>
> (also, the joy of the vendors hehe)

Actually about the limited part, they could use another policy 👌🏼

@markmandel
Collaborator Author

> > Looking great !
> > I'm not sure about the "limited" part, if there should also be something to prevent going over a max value ? Or we expect that they handle it from their wasm ? 🤔
> > (also, the joy of the vendors hehe)
>
> Actually about the limited part, they could use another policy 👌🏼

I think I know what you're saying -- but just to be sure. The wasm policy matches the webhook autoscaler policy, in that there's no specified maximum Agones is aware of (but there could be one behind the scenes from the webhook or wasm -- that's up to that implementation).

Before I merge, does this line up with your expectations and understanding?
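
To illustrate the point about a maximum living inside the implementation, here is a minimal sketch, not code from this PR, of how a plugin or webhook author might cap its own answer before returning it. `capReplicas` and the bounds used are hypothetical; Agones itself would simply apply whatever replica count comes back.

```go
package main

import "fmt"

// capReplicas clamps a desired replica count into [minReplicas, maxReplicas].
// Hypothetical helper: the limits are owned by the plugin, not by Agones.
func capReplicas(desired, minReplicas, maxReplicas int32) int32 {
	if desired < minReplicas {
		return minReplicas
	}
	if desired > maxReplicas {
		return maxReplicas
	}
	return desired
}

func main() {
	fmt.Println(capReplicas(250, 1, 100)) // prints 100
	fmt.Println(capReplicas(0, 1, 100))   // prints 1
	fmt.Println(capReplicas(42, 1, 100))  // prints 42
}
```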

@markmandel markmandel force-pushed the feature/wasm-controller-logic branch from bf2e78d to af20e92 Compare October 15, 2025 15:42

@agones-bot
Collaborator

Build Failed 😭

Build Id: 923d9cdd-38fd-442a-b0a0-600f72abe8b7

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@lacroixthomas
Collaborator

> > > Looking great !
> > > I'm not sure about the "limited" part, if there should also be something to prevent going over a max value ? Or we expect that they handle it from their wasm ? 🤔
> > > (also, the joy of the vendors hehe)
> >
> > Actually about the limited part, they could use another policy 👌🏼
>
> I think I know what you're saying -- but just to be sure. The wasm policy matches the webhook autoscaler policy, in that there's no specified maximum Agones is aware of (but there could be one behind the scenes from the webhook or wasm -- that's up to that implementation).
>
> Before I merge, does this line up with your expectations and understanding?

Yep all good 👌🏼

@markmandel
Collaborator Author

Huh this one again 🤔

us-docker.pkg.dev/agones-images/ci
installing current release
# if IMAGE_PULL_SECRET_FILE is specified, create the agones-system namespace and install the secret
bash -c '[[ $(helm status agones -n agones-system --output json | jq -r ".info.status") =~ (failed|pending-.*) ]] && helm uninstall agones --namespace=agones-system || true'
\
	helm upgrade --install --atomic --wait --timeout 10m --namespace=agones-system \
	--create-namespace \
	--set agones.image.tag=1.53.0-dev-af20e92,agones.image.registry="us-docker.pkg.dev/agones-images/ci" \
	--set agones.image.controller.pullPolicy="Always",agones.image.controller.pullSecret= \
	--set agones.image.extensions.pullPolicy="Always",agones.image.allocator.pullPolicy="Always" \
	--set agones.image.ping.pullPolicy="Always",agones.image.sdk.alwaysPull=true \
	--set agones.ping.http.serviceType="LoadBalancer",agones.ping.udp.serviceType="LoadBalancer" \
	--set agones.allocator.service.serviceType="LoadBalancer" \
	--set agones.controller.logLevel="debug" \
	--set agones.crds.cleanupOnDelete=true \
	--set agones.featureGates="" \
	--set agones.allocator.service.loadBalancerIP=35.190.155.91 \
	--set agones.metrics.serviceMonitor.enabled=false \
	 \
	agones /go/src/agones.dev/agones/install/helm/agones/
Error: UPGRADE FAILED: release agones failed, and has been rolled back due to atomic being set: could not get information about the resource: an error on the server ("Internal Server Error: \"/api/v1/namespaces/agones-system/services/agones-ping-http-service\": the server is currently unable to handle the request") has prevented the request from succeeding (get services agones-ping-http-service)
make: *** [Makefile:450: install] Error 1

Maybe there's a cluster update happening.

@markmandel
Collaborator Author

/gcbrun

@agones-bot
Collaborator

Build Succeeded 🥳

Build Id: 941a5b53-9676-4058-8c15-dcf163a4d7e2

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4299/head:pr_4299 && git checkout pr_4299
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.53.0-dev-af20e92

@markmandel markmandel merged commit d40e29c into agones-dev:main Oct 15, 2025
4 checks passed
@markmandel markmandel deleted the feature/wasm-controller-logic branch October 15, 2025 20:35
mnthe pushed a commit to mnthe/agones that referenced this pull request Mar 23, 2026