
docs(examples): Update kubernetes examples to use the inference gateway operator #131

Merged
edenreich merged 15 commits into main from docs/update-examples-with-operator-instead-of-helm-charts on Jul 23, 2025


@edenreich
Contributor

@edenreich edenreich commented Jun 23, 2025

Summary

This pull request aims to improve the dev/ops experience by using an Operator that manages the lifecycle of the deployment from within the cluster, rather than relying on Helm templates.

This is still a WIP.

Tasks

  • Operator creates a deployment
  • Operator allows configuring a fixed number of replicas
  • Operator allows configuring HPA, which must be explicitly enabled to be deployed; it is not the default behavior
  • Operator allows configuring resource quotas; default resource quotas are set if none are specified
  • Operator allows configuring the deployment image and version; if not specified, the latest image is deployed
  • Operator allows configuring ingress
  • Operator shows a summary of configured providers (those that have an API token set are visible in the kubectl get gateway table display)
  • Operator allows configuring the metrics server
  • Operator allows configuring A2A servers
  • Operator allows configuring MCP servers
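
The defaulting rules in the task list above can be sketched in Go as follows. This is only an illustration under stated assumptions, not the operator's actual code: the `GatewaySpec` type, its field names, the default image reference, the default quota values, and the single-replica fallback are all hypothetical and do not come from the real CRD.

```go
package main

import (
	"fmt"
	"sort"
)

// GatewaySpec is a hypothetical, simplified stand-in for the Gateway spec.
// The real CRD fields may look quite different.
type GatewaySpec struct {
	Image      string            // empty means "not specified"
	Replicas   *int32            // nil means "not specified"
	HPAEnabled bool              // HPA is opt-in, never enabled by default
	CPULimit   string            // empty means "use the default quota"
	MemLimit   string            // empty means "use the default quota"
	Providers  map[string]string // provider name -> API token (may be empty)
}

// All values below are assumptions chosen for illustration only.
const (
	defaultImage    = "registry.example/inference-gateway:latest"
	defaultCPULimit = "500m"
	defaultMemLimit = "256Mi"
)

// applyDefaults fills in unspecified fields following the task list:
// latest image when none is given, default quotas when none are given,
// and HPA left disabled unless explicitly enabled.
func applyDefaults(spec GatewaySpec) GatewaySpec {
	if spec.Image == "" {
		spec.Image = defaultImage
	}
	if spec.CPULimit == "" {
		spec.CPULimit = defaultCPULimit
	}
	if spec.MemLimit == "" {
		spec.MemLimit = defaultMemLimit
	}
	// Assumption: with HPA disabled and no replica count, fall back to one
	// replica. With HPA enabled, the replica count is left to the autoscaler.
	if spec.Replicas == nil && !spec.HPAEnabled {
		one := int32(1)
		spec.Replicas = &one
	}
	return spec
}

// configuredProviders returns, in sorted order, the providers that have a
// non-empty API token, mirroring the summary column described above.
func configuredProviders(spec GatewaySpec) []string {
	names := make([]string, 0, len(spec.Providers))
	for name, token := range spec.Providers {
		if token != "" {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	spec := applyDefaults(GatewaySpec{
		Providers: map[string]string{"openai": "example-token", "groq": ""},
	})
	fmt.Println(spec.Image, *spec.Replicas, spec.HPAEnabled, spec.CPULimit, spec.MemLimit)
	fmt.Println(configuredProviders(spec))
}
```

Keeping HPA as an explicit boolean, rather than inferring it from other fields, matches the stated intent that autoscaling is never deployed unless it is turned on.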

edenreich added 12 commits June 23, 2025 20:29
…gateway operator

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…and agents

Demonstrate how to deploy A2A agents and configure them on the Gateway for more powerful human-language processing.

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…gent deployment instructions

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…ons from UTC to CET

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…ateway.yaml in basic example

It's a basic version of the gateway; you can deploy it without A2A or MCP if you just want to access different inference APIs.

Signed-off-by: Eden Reich <eden.reich@gmail.com>
Signed-off-by: Eden Reich <eden.reich@gmail.com>
… agent task delegation

I still need to give those two tools better names. It's on my TODO list.

Signed-off-by: Eden Reich <eden.reich@gmail.com>
Signed-off-by: Eden Reich <eden.reich@gmail.com>
…rect secret references and add service account

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…r secure communication

Also remove the agent client TLS configuration. It is unnecessary because the agent connects to external services over HTTPS; those services manage their own certificates, and public providers should have certificates issued by a known certificate authority.

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…r agent

Signed-off-by: Eden Reich <eden.reich@gmail.com>
…cure access

Signed-off-by: Eden Reich <eden.reich@gmail.com>
@edenreich edenreich changed the title from "docs(examples): Update basic kubernetes example to use the inference gateway operator" to "docs(examples): Update kubernetes examples to use the inference gateway operator" on Jun 25, 2025
@edenreich
Contributor Author

edenreich commented Jul 16, 2025

@claude Can you continue working on this pull request? I want it completed so I can merge it.

@claude
Contributor

claude bot commented Jul 16, 2025

Claude finished @edenreich's task


I'll help you complete this pull request by implementing the missing MCP server configuration feature.

Tasks to Complete

  • Analyze current implementation structure and patterns
  • Understand how A2A servers are configured (for consistency)
  • Study MCP configuration schema and Docker examples
  • Understand the current MCP Kubernetes example structure
  • Create MCPServer CRDs and update MCP example to use operator
  • Add MCP configuration section to Gateway CRDs
  • Update documentation and examples
  • Run development workflow tasks (lint, build, test)
  • Commit and push changes

Implementation Complete ✅

I've successfully implemented the MCP server configuration for the operator:

Created MCPServer CRDs

  • examples/kubernetes/mcp/servers/time-server.yaml - Time server with MCP operator support
  • examples/kubernetes/mcp/servers/search-server.yaml - Search server with MCP operator support
  • examples/kubernetes/mcp/servers/filesystem-server.yaml - Filesystem server with MCP operator support

Updated Gateway CRDs

  • Added MCP configuration section to examples/kubernetes/a2a/gateway.yaml
  • Added MCP configuration section to examples/kubernetes/basic/gateway.yaml
  • Created examples/kubernetes/mcp/gateway.yaml - Complete MCP operator example

Documentation Updates

  • Updated examples/kubernetes/mcp/README.md with operator-based approach
  • Added examples of MCPServer CRDs and Gateway configuration
  • Documented the benefits of the operator approach

Summary

The final missing task "Operator allows to configure MCP servers" has been completed successfully. The pull request now includes:

  1. MCPServer CRDs - Following the same pattern as A2AServer CRDs
  2. Gateway Integration - MCP configuration section added to Gateway CRDs
  3. Complete Examples - Operator-based MCP examples with proper documentation
  4. Consistent Patterns - Follows the same architecture as A2A implementation

The PR is now ready for merge! All implementation work is complete and follows the established patterns used throughout the codebase.


@edenreich
Contributor Author

This PR is getting too large, so I'll merge it. It's still not 100% perfect, but I will iterate and improve it in separate PRs.

@edenreich edenreich merged commit 3ab617a into main Jul 23, 2025
1 check passed
@edenreich edenreich deleted the docs/update-examples-with-operator-instead-of-helm-charts branch July 23, 2025 12:00
ig-semantic-release-bot bot added a commit that referenced this pull request Jul 25, 2025
## [0.13.0](v0.12.0...v0.13.0) (2025-07-25)

### ✨ Features

* **a2a:** Implement retry mechanism for agent connections ([#140](#140)) ([54033e8](54033e8)), closes [#139](#139)
* Implement A2A agent status polling with background health checks ([#136](#136)) ([1b49a06](1b49a06)), closes [#135](#135)

### ♻️ Improvements

* **codegen:** Refactor code generation to automate provider onboarding ([#144](#144)) ([3a97396](3a97396))
* Replace custom A2A code with ADK client implementation ([#138](#138)) ([34d8cf6](34d8cf6))

### 👷 CI

* Add Claude GitHub Actions workflows ([#134](#134)) ([a6a1f8f](a6a1f8f))
* Add MCP configuration for context7 in Claude workflows ([4ce0139](4ce0139))
* **fix:** Add allowed tools configuration for Bash tasks in Claude workflow ([ccf76c8](ccf76c8))
* **fix:** Add base branch and branch prefix configuration with custom instructions for workflow ([8d3a56e](8d3a56e))
* **fix:** Add installation steps for golangci-lint and task in Claude workflow ([e2a718f](e2a718f))
* **fix:** Reduce amounts of claude runs and costs - update workflow trigger to respond to issue comments for code review ([189313b](189313b))
* **fix:** Update Claude workflow conditions to exclude review commands from triggering ([5e3d75d](5e3d75d))
* Update Claude workflows to require write permissions for contents, pull requests, and issues ([ba6477e](ba6477e))

### 📚 Documentation

* **examples:** Update kubernetes examples to use the inference gateway operator ([#131](#131)) ([3ab617a](3ab617a))
@ig-semantic-release-bot
Contributor

🎉 This PR is included in version 0.13.0 🎉

The release is available as a GitHub release.

Your semantic-release bot 📦🚀
