Conversation
**Commit Message** This removes the inference extension related code from the controller to reduce the size of the refactoring PR #629. We need to do the complete redo on inference extension after #629, so this doesn't mean that we drop the support for it. --------- Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
|
ok all good except that i need to backfill the unit tests on the new gateway reconciler and gateway pod mutators. other than that, ready |
| // Note: when multiple AIGatewayRoute resources are attached to the same Gateway, and each | ||
| // AIGatewayRoute has a different resource configuration, the ai-gateway will pick one of them | ||
| // to configure the resource requirements of the external processor container. | ||
| // |
There was a problem hiding this comment.
i want some ideas on this - maybe we might want a "gateway-level" CRD to define things like this
There was a problem hiding this comment.
I think we can add the resource requirement to the ai gateway controller flags.
manifests/charts/ai-gateway-helm/templates/admission_webhook.yaml
Outdated
Show resolved
Hide resolved
|
i think this lacks lots of code comments compared to the usual changes i did ... |
|
i think i need to add one additional end to end test where we apply multiple AIGatewayRoutes |
aabchoo
left a comment
There was a problem hiding this comment.
minor comments still reviewing
|
i will refactor around how certs are created etc tomorrow, following a nice advice by @arkodg |
|
|
||
| var slogLevel slog.Level | ||
| if err = slogLevel.UnmarshalText([]byte(*extProcLogLevelPtr)); err != nil { | ||
| if err := slogLevel.UnmarshalText([]byte(*extProcLogLevelPtr)); err != nil { |
There was a problem hiding this comment.
curious: why use := when err is defined a few lines above?
| // PostVirtualHostModify allows an extension to modify the virtual hosts in the xDS config. | ||
| // | ||
| // Currently, this replaces the route that has "x-ai-eg-selected-route" pointing to "original_destination_cluster" to route to the original destination cluster. | ||
| func (s *Server) PostVirtualHostModify(_ context.Context, req *egextension.PostVirtualHostModifyRequest) (*egextension.PostVirtualHostModifyResponse, error) { |
There was a problem hiding this comment.
i should've left note: this code here was not used at all after InfExt controller code removal #632, and I didn't want to fix the test for this unused code hence I removed
| // httpHeaderName: x-ai-eg-original-dst | ||
| // useHttpHeader: true | ||
| // type: ORIGINAL_DST | ||
| if !extProcUDSExist { |
There was a problem hiding this comment.
note: this uds cluster creation was necessary to attach the UDS cluster to the upstream filter below since now the extension policy was created per-gateway, not per-route, so we needed to create a unique name UDS cluster that can be attached below
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
|
ok it seems e2e is flaky... i need to fix |
|
hmm still flaky... |
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
|
@yuzisun @wengyao04 PTAL |
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
|
man flake |
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
# Conflicts: # cmd/aigw/run.go
# Conflicts: # internal/controller/ai_gateway_route.go # internal/controller/ai_gateway_route_test.go
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Commit Message
This commit refactors the internal on how the ext proc is deployed. Specifically, this switches to insert the ext proc container as a sidecar container of Envoy pods created by Envoy Gateway. This is another large refactoring that turned out necessary for #599. This utilizes the mutating webhook to insert the extproc container Envoy pods.
Making the extproc as as sidecar means that we now have a one-to-one mapping between Gateway and the extproc hence this naturally resolves the previously known limitation #509 and now users can attach multiple AIGatewayRoute(s) to one Gateway.
Implementation note: since the volume mounts only work in the namespace-scoped way, use-created secrets (like API Keys) cannot be mounted by the extproc as it runs in "envoy-gateway-system" namespace. To resolve this, now the controller reads the secret and embed the read credentials into the "extproc secret" (which is previously known as "extproc configmap") together with routing, matching and backend information. That secret is written in the "envoy-gateway-system" namespace hence it can be mounted by the extproc container.
Related Issues/PRs (if applicable)
Resolves #509
Resolves #621