gateway-api-inference-extenstion support by LiorLieberman · Pull Request #55436 · istio/istio

LiorLieberman · 2025-03-09T23:33:03Z

Add initial support for gateway-api-inference extension

istio-testing · 2025-03-09T23:33:14Z

Hi @LiorLieberman. Thanks for your PR.

I'm waiting for an Istio member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ilrudie

Leaving a couple thoughts. Looks like a good direction on first look but this is a little outside my area of expertise so would benefit from more eyes or, perhaps, a deeper look from me.

pilot/pkg/config/kube/gateway/deploymentcontroller.go

ilrudie · 2025-03-10T13:46:29Z

pilot/pkg/config/kube/gateway/inferencepool.go

+// NewInferencePoolController constructs a new InferencePoolController and registers required informers.
+func NewInferencePoolController(client kube.Client) *InferencePoolController {
+	filter := kclient.Filter{ObjectFilter: client.ObjectFilter()}
+	pools := kclient.NewFiltered[*inferencev1alpha2.InferencePool](client, filter)


should probably be a delayed informer so that we don't block startup if the CRD is missing

Quick question here: why are we not using delayed informers for the Gateway CRDs? https://github.com/istio/istio/blob/master/pilot/pkg/config/kube/gateway/deploymentcontroller.go#L186

`if s.kubeClient.CrdWatcher().WaitForCRD(gvr.KubernetesGateway, leaderStop) {` (istio/pilot/pkg/bootstrap/configcontroller.go, line 191 in 658ed85) gates it.

Why do we use that vs delayedInformer? Two reasons: this code predates delayedInformer, and delayedInformer doesn't give a write client (though I think it could).

I think we do need a write client, to update status/labels etc. And it is gated the same way. Should we leave it like this then?
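A minimal sketch of the gating idea discussed in this thread, in plain Go. The `crdWatcher` type and its methods are hypothetical stand-ins written for illustration; they are not Istio's `CrdWatcher` or `delayedInformer`, and only mimic the "return true once the CRD exists, false if stopped first" behavior that the `WaitForCRD` call above appears to have.

```go
package main

import (
	"fmt"
	"sync"
)

// crdWatcher is a toy stand-in (not Istio's type) that tracks which CRDs
// are known to be installed and lets callers wait for one to appear.
type crdWatcher struct {
	mu      sync.Mutex
	known   map[string]bool
	waiters map[string][]chan struct{}
}

func newCRDWatcher() *crdWatcher {
	return &crdWatcher{
		known:   map[string]bool{},
		waiters: map[string][]chan struct{}{},
	}
}

// markInstalled records that a CRD exists and wakes anything waiting on it.
func (w *crdWatcher) markInstalled(name string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.known[name] {
		return
	}
	w.known[name] = true
	for _, ch := range w.waiters[name] {
		close(ch)
	}
	delete(w.waiters, name)
}

// waitForCRD blocks until the CRD is installed or stop is closed.
// It returns true only if the CRD is available.
func (w *crdWatcher) waitForCRD(name string, stop <-chan struct{}) bool {
	w.mu.Lock()
	if w.known[name] {
		w.mu.Unlock()
		return true
	}
	ch := make(chan struct{})
	w.waiters[name] = append(w.waiters[name], ch)
	w.mu.Unlock()

	select {
	case <-ch:
		return true
	case <-stop:
		return false
	}
}

func main() {
	w := newCRDWatcher()
	stop := make(chan struct{})
	started := make(chan string, 1)

	// Gate the controller in its own goroutine so that overall startup
	// is not blocked while the CRD is missing.
	go func() {
		if w.waitForCRD("inferencepools", stop) {
			started <- "inference pool controller started"
		}
	}()

	w.markInstalled("inferencepools")
	fmt.Println(<-started)
}
```

With this shape the controller still gets whatever client it wants once the gate opens, which is the trade-off raised above: gating keeps the write client, while a delayed informer would avoid the explicit wait.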

ilrudie · 2025-03-10T13:51:55Z

pilot/pkg/config/kube/gateway/inferencepool.go

+			Namespace: pool.GetNamespace(),
+		},
+	}
+	if shadowSvc.Labels == nil {


nit, you could just initialize this in the literal above

What if it's an update?
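To make the two positions in this thread concrete, here is a small sketch in plain Go. The `service` type, the `poolLabel` key, and both helper functions are hypothetical and only stand in for the shadow Service being built in the PR. A freshly constructed object can set its label map in the literal, but an object obtained elsewhere (for example during an update) may carry a nil map, and writing to a nil map panics in Go.

```go
package main

import "fmt"

// poolLabel is a made-up label key used only for this illustration.
const poolLabel = "example.com/inference-pool"

// service is a simplified stand-in for a Kubernetes Service.
type service struct {
	Name   string
	Labels map[string]string
}

// newShadowService builds a fresh object, so the label map can be
// initialized directly in the struct literal.
func newShadowService(name, pool string) *service {
	return &service{
		Name:   name,
		Labels: map[string]string{poolLabel: pool},
	}
}

// setPoolLabel mutates an object that came from somewhere else. Its label
// map may be nil, so it is guarded before the write.
func setPoolLabel(svc *service, pool string) {
	if svc.Labels == nil {
		svc.Labels = map[string]string{}
	}
	svc.Labels[poolLabel] = pool
}

func main() {
	fresh := newShadowService("pool-a-shadow", "pool-a")

	// Simulates an existing object that was stored without any labels.
	existing := &service{Name: "pool-a-shadow"}
	setPoolLabel(existing, "pool-a")

	fmt.Println(fresh.Labels[poolLabel], existing.Labels[poolLabel])
}
```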

pilot/pkg/config/kube/gateway/inferencepool.go

pilot/pkg/bootstrap/configcontroller.go

danehans · 2025-03-13T16:12:12Z

pilot/pkg/config/kube/gateway/conversion.go

+	endpointPickerPort string
+}
+
+func buildDestination(ctx configContext, to k8s.BackendRef, ns string, enforceRefGrant bool, k config.GroupVersionKind) (*istio.Destination, *inferencePoolConfig, *ConfigError) {


It would be nice not to change the return signature, since *inferencePoolConfig is dropped in all but one place. Is it possible to add an optional metadata field to istio.Destination to hold the inferencePoolConfig? Thoughts @howardjohn @ilrudie
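A rough sketch of the suggestion above, using simplified hypothetical types rather than the real `istio.Destination` (a generated API type whose fields are not shown here). The `destination` struct, its `inferencePool` field, and the `endpointPickerHost` field are assumptions made for illustration; only `endpointPickerPort` appears in the diff. The point is that an optional pointer field lets the builder keep a two-value return, and callers that do not care about inference pools simply ignore it.

```go
package main

import (
	"errors"
	"fmt"
)

// inferencePoolConfig mirrors the idea of the struct in the diff.
// endpointPickerHost is a hypothetical field added for this sketch.
type inferencePoolConfig struct {
	endpointPickerHost string
	endpointPickerPort string
}

// destination is a simplified stand-in for a route destination. The
// optional inferencePool field is nil for ordinary Service backends.
type destination struct {
	Host          string
	Port          uint32
	inferencePool *inferencePoolConfig
}

// buildDestination returns the destination and an error only. Any
// inference pool details ride along on the destination itself.
func buildDestination(host string, port uint32, pool *inferencePoolConfig) (*destination, error) {
	if host == "" {
		return nil, errors.New("backend host must not be empty")
	}
	return &destination{Host: host, Port: port, inferencePool: pool}, nil
}

func main() {
	plain, _ := buildDestination("reviews.default.svc.cluster.local", 9080, nil)
	pooled, _ := buildDestination("pool-a.default.svc.cluster.local", 8000,
		&inferencePoolConfig{endpointPickerHost: "epp.default.svc.cluster.local", endpointPickerPort: "9002"})

	// Only the one caller that needs the pool config has to look at it.
	fmt.Println(plain.inferencePool == nil, pooled.inferencePool.endpointPickerPort)
}
```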

danehans · 2025-03-13T17:00:37Z

pilot/pkg/config/kube/gateway/inferencepool.go

+	// if !isManaged(pool) {
+	// 	log.Debugf("inference pool is not managed by this controller")
+	// 	return nil


You can pass the controllerName to InferencePoolController and watch HTTPRoutes (only ones with a matching `status.parents[].controllerName`) that reference an InferencePool as a backendRef. I created kubernetes-sigs/gateway-api-inference-extension#489 to see if this can be simplified.

danehans · 2025-03-17T16:32:32Z

pilot/pkg/xds/filters/filters.go

+					},
+					Timeout: &durationpb.Duration{Seconds: 10},
+				},
+				FailureModeAllow: true,


TODO: Add support for failureMode.

Right, we should think about this a little more. Theoretically, if I don't need failureMode, original_dst is probably the easiest, I think. But TBD.

howardjohn

LGTM for merging into experimental work. Nice work.

(Possibly very) long-term changes I would hope for include:

ext_proc becoming first class vs specialized to inference
If above doesn't solve it, a way to configure the subset part of this
A way to avoid needing to deploy a service for the InferencePool
Support for the various modes (GRPCRoute, Waypoint, sidecar, etc)

All of these are perfectly fine for followups. TBD which, if any, would be blockers for merging to master which hasn't been discussed much yet.

istio.deps

pilot/pkg/bootstrap/configcontroller.go

pilot/pkg/config/kube/gateway/conversion.go

howardjohn · 2025-03-19T19:06:30Z

pilot/pkg/config/kube/gateway/inferencepool.go

+		} else {
+			// Update the service if it exists
+			// Note: We need to use Update instead of Patch for simpler implementation
+			// This means we might overwrite other fields, but that's the expected behavior


Note a variety of controllers add arbitrary labels/annotations to services after creation so this may be problematic. I don't mind for short term of course, especially as I hope this is temporary.
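One way to soften the concern above, sketched in plain Go: merge the labels this controller wants into whatever is already on the object instead of replacing the whole map. `mergeLabels` is a hypothetical helper, not code from the PR, and it covers labels only; a real fix might instead use a patch or server-side apply.

```go
package main

import "fmt"

// mergeLabels returns a new map holding every existing label, with the
// desired labels layered on top. Labels added by other controllers are
// preserved, and neither input map is mutated.
func mergeLabels(existing, desired map[string]string) map[string]string {
	out := make(map[string]string, len(existing)+len(desired))
	for k, v := range existing {
		out[k] = v
	}
	for k, v := range desired {
		out[k] = v
	}
	return out
}

func main() {
	// Labels currently on the Service, including one set by another controller.
	existing := map[string]string{
		"other-controller/owned": "true",
		"example.com/pool":       "old-pool",
	}
	// Labels this controller wants to enforce.
	desired := map[string]string{"example.com/pool": "pool-a"}

	fmt.Println(mergeLabels(existing, desired))
}
```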

pilot/pkg/config/kube/gateway/inferencepool.go

pilot/pkg/networking/core/cluster_builder.go

pilot/pkg/networking/core/route/route.go

pilot/pkg/xds/endpoints/endpoint_builder.go

howardjohn · 2025-03-24T14:42:44Z

/ok-to-test

pilot/pkg/networking/core/cluster_builder.go

pilot/pkg/networking/core/listener_builder.go

hzxuzhonghu · 2025-03-25T01:58:02Z

pilot/pkg/networking/core/route/route.go

+		if len(routeNameParts) > 1 {
+			// TODO(liorlieberman): support configurable domain names
+			fqdn := fmt.Sprintf("%s.%s.svc.%s", routeNameParts[1], virtualService.Namespace, "cluster.local")
+			infPoolConfig = fqdn + ":" + routeNameParts[2]


I suspect this is too hacky: it seems you are depending on httpRoute.Name containing "%%" to indicate the inference pool.
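For reference, a defensive version of the parsing shown in the diff, as a standalone Go sketch. The `"<route>%%<pool>%%<port>"` layout is an assumption inferred from the snippet and this comment, and `parseInferenceRouteName` is a hypothetical helper. Note that the quoted diff checks `len(routeNameParts) > 1` but then reads index 2, so the sketch requires at least three parts before indexing.

```go
package main

import (
	"fmt"
	"strings"
)

// parseInferenceRouteName extracts "<pool FQDN>:<port>" from a route name
// assumed to look like "<route>%%<pool>%%<port>". It reports false when
// the name does not carry both the pool and the port.
func parseInferenceRouteName(name, namespace, domain string) (string, bool) {
	parts := strings.Split(name, "%%")
	if len(parts) < 3 || parts[1] == "" || parts[2] == "" {
		return "", false
	}
	fqdn := fmt.Sprintf("%s.%s.svc.%s", parts[1], namespace, domain)
	return fqdn + ":" + parts[2], true
}

func main() {
	for _, name := range []string{"route%%pool-a%%9002", "route%%pool-a", "plain-route"} {
		cfg, ok := parseInferenceRouteName(name, "default", "cluster.local")
		fmt.Printf("%q -> %q (ok=%v)\n", name, cfg, ok)
	}
}
```

Taking the domain as a parameter also leaves room for the configurable domain names mentioned in the TODO.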

hzxuzhonghu · 2025-03-25T01:59:30Z

pilot/pkg/xds/filters/filters.go

+				GrpcService: &core.GrpcService{
+					TargetSpecifier: &core.GrpcService_EnvoyGrpc_{
+						EnvoyGrpc: &core.GrpcService_EnvoyGrpc{
+							ClusterName: "dummy",


What does "dummy" mean here?

hzxuzhonghu · 2025-03-25T02:07:37Z

pilot/pkg/xds/filters/filters.go

+						Untyped: []string{constants.EnvoySubsetNamespace},
+					},
+					ForwardingNamespaces: &extproc.MetadataOptions_MetadataNamespaces{
+						Untyped: []string{constants.EnvoySubsetNamespace},


Forwarding what by default?

* Automator: update proxy@master in istio/istio@master (istio#55590)
* add inferencepool reconcile draft
* add global extProc draft
* add support for infernecePool internal sematics
* attempt to add extProcPerRoute
* address comments
* address feedback
* add push_context changes
* add inferencepool status handler
* more nits
* add tests and cleanup
* add deploymentcontroller tests
* more tests
* furnish tests
* gofmt
* add temporaray exclude
* more formatting
* more stuff

---------

Co-authored-by: Istio Automation <istio-testing-bot@google.com>

* gateway-api-inference-extenstion support (#55436) * Automator: update proxy@master in istio/istio@master (#55590) * add inferencepool reconcile draft * add global extProc draft * add support for infernecePool internal sematics * attempt to add extProcPerRoute * address comments * address feedback * add push_context changes * add inferencepool status handler * more nits * add tests and cleanup * add deploymentcontroller tests * more tests * furnish tests * gofmt * add temporaray exclude * more formatting * more stuff --------- Co-authored-by: Istio Automation <istio-testing-bot@google.com> * Clusterrole and auto gen (#55644) * Automator: update proxy@master in istio/istio@master (#55590) * add auto crds gen and clusterrole --------- Co-authored-by: Istio Automation <istio-testing-bot@google.com> * fix multiple inferencepools per route (#55683) * ensure inference route name list has all the parts (#55705) * change to extproc requestBody processing mode to FULL_DUPLEX_STREAMED for Gateway Inference Extension (#56089) * fix: change to FULL_DUPLEX_STREAMED processing mode for GIE * Automator: update common-files@master in istio/istio@master (#55878) * TMP: Regen due to odd something version update something somewhere for some reason * a * Automator: update proxy@master in istio/istio@master (#56094) * Create ambient multinetwork flag (#55991) * Create new multinetwork flag Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Add new topology Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Pass Values.global.network to ztunnel Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Skip workload to workload test if workloads are in different clusters Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix several tests that aren't ready for multicluster Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix some more tests Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix all non-CNI/taint ambient tests 
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix lint Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Aslak Knutsen <aslak@4fs.no> Co-authored-by: Istio Automation <istio-testing-bot@google.com> Co-authored-by: Keith Mattix II <keithmattix@microsoft.com> * fix: The Gateway Inference Extension doesn't receive response bodies (#56463) * Set the Gateway Inference Extension EndPointPicker to receive response bodies Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updated the test to look for extra configuration parameters Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Formatting change after running make gen Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Correct typo in comment Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates from running make gen Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * fix: update generated files after rebase The rebase on master requires a `make gen`. related to #56612 * gw-inference: support new envoy override_host lb policy (#56623) * gw-inference: support new envoy override_host lb policy * mainly formatting * cleanup and tests for inference-lb-policy (#56655) * improve inference config propagation (#56660) * add extra field for Config for inferencepool config propagation * fix tests * address feedback * dont mutate external map inside the itr * Refactor inference extension support to krt (#56656) * Refactor inference extension support to krt We do this mainly by integrating it with the rest of the gateway controller. 
The conversion logic now has shadow inference services as an output we can query from pushcontext or whatever else Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Create shadow service outside of the collection Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Address PR comments Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Actually run the queue Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fixups Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fallback to deep equal Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * fix(gw-inference): Add extra to the VirtualService Config clone/copy (#56775) * Add inferencepools by gateways index (#56777) Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * replace label-based gateway opt in (#56778) * replace label-based gateway opt in * replace label names * add-tests * feat(gw-inference): Add support for InferencePool reference grants (#56832) * feat(gw-inference): support InferencePool as To target for ReferenceGrants Allow cross namespace references for InferencePools. This allow HTTPRoutes to target InferencePools in another namespace. ReferenceGrants.BackendAllowed now takes a GVK to the to Object as an argument along side namespace and name. If GVK is unknown it will default to Service. * test: add resource grant for inferencepool tests * Inference pool Status handling and cross namespace references (#56792) * feat(gw-inference): Improve InferencePool.Status handling Includes tests for InferencePool.Status handling and cross namespace lookups. 
Test for the following basic scenarios: * Cross namespace references * Removing old state * Keeping old state from other controllers * ResolvedRef conditions * Accepted conditions A HTTPRoute can target BackendRefs in namespaces besides it self, so when we try to do a reverse lookup from InferencePool to find parent Gateways via HTTPRoute we need to list all HTTPRoutes in all Namespaces. Added InferencePool to HTTPRoute.BackendRef index for easy access. lookups. relates to #56621 * fix: safe guard possible nil access * test: add copyright and lint fixes * fix: remove default status message when we take control * fix: remove unused status types hanging around from earlier runs. We want to keep the ones we know to update state or not * fix(test): lint line to long * add failuremode support on inferencepool (#56819) * add failuremode support on inferencepool * update proxy sha * fix conversion_test.go * fix crl test * fix(gw-inference): Refactored InferencePool Collection (#56831) * fix: refactored InferenceController Refactored into 3 main pieces; 0. fetch routes via index 1. find our gateway parents 2. create shadow info 3. create status Update to latest GIE to get API changes Added support for checking the HTTPRoute Accepted status as part of InferencePool Accepted status. * fix: reuse isInferencePoolBackendRef in index * fix: use ptr.OrEmpty where applicable * run make gen * fix(gw-inference): Shadow service should only exist if it has to (#56839) The InferencePool should only create a shadow service if a HTTPRoute that has a Gateway that we control is connected. Else do nothing. Clean up Service and Status if the HTTPRoute disconnects. 
* Add releasenote and disable gw-inference by default (#56848) * add releasenotes and disable-by-default * add test featureflag * increase binary sizes due to krt * feedback and cleanup (#56866) * (gaie): Fixups from master PR (#56899) * Fixups Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Tackle other TODO Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Co-authored-by: Istio Automation <istio-testing-bot@google.com> Co-authored-by: Aslak Knutsen <aslak@4fs.no> Co-authored-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Shmuel Kallner <kallner@il.ibm.com> Co-authored-by: Aslak Knutsen <aslak.tux@gmail.com>


istio-testing added the labels do-not-merge/work-in-progress (blocks merging of a PR because it isn't ready yet), needs-rebase (indicates a PR needs to be rebased before being merged), and size/XL (denotes a PR that changes 500-999 lines, ignoring generated files) Mar 9, 2025

istio-testing added the needs-ok-to-test label Mar 9, 2025

LiorLieberman force-pushed the inferencepool-reco branch 2 times, most recently from 505ed9c to fc881d8 Compare March 10, 2025 00:17

ilrudie reviewed Mar 11, 2025

danehans reviewed Mar 13, 2025

danehans reviewed Mar 17, 2025

istio-testing and others added 5 commits March 19, 2025 12:36

Automator: update proxy@master in istio/istio@master (istio#55590)

0daec5d

add inferencepool reconcile draft

1ac686a

add global extProc draft

f1a3537

add support for infernecePool internal sematics

12af799

attempt to add extProcPerRoute

42a4089

LiorLieberman force-pushed the inferencepool-reco branch from 2a16eb6 to c2902a3 Compare March 19, 2025 17:03

istio-testing removed the needs-rebase label (indicates a PR needs to be rebased before being merged) Mar 19, 2025

LiorLieberman changed the base branch from master to experimental-gwapi-inference-extension March 19, 2025 17:53

istio-testing added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Mar 19, 2025

LiorLieberman changed the base branch from experimental-gwapi-inference-extension to master March 19, 2025 17:53

istio-testing added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) Mar 19, 2025

LiorLieberman changed the base branch from master to experimental-gwapi-inference-extension March 19, 2025 18:03

address comments

4173b46

LiorLieberman force-pushed the inferencepool-reco branch from c2902a3 to 4173b46 Compare March 19, 2025 18:21

howardjohn approved these changes Mar 19, 2025

LiorLieberman added 2 commits March 19, 2025 21:53

address feedback

b0f82cb

add push_context changes

e6c008a

LiorLieberman marked this pull request as ready for review March 20, 2025 16:11

more tests

923d9ba

LiorLieberman requested a review from howardjohn March 22, 2025 00:54

istio-testing added the ok-to-test label (allows normal testing to take place for a PR not submitted by an Istio org member) and removed the needs-ok-to-test label Mar 24, 2025

furnish tests

110153a

LiorLieberman force-pushed the inferencepool-reco branch from 3fdb3a9 to 110153a Compare March 24, 2025 15:01

gofmt

b3e9327

LiorLieberman force-pushed the inferencepool-reco branch from 2644149 to b3e9327 Compare March 24, 2025 15:26

add temporaray exclude

2697e5c

LiorLieberman force-pushed the inferencepool-reco branch from 8431559 to 2697e5c Compare March 24, 2025 16:07

LiorLieberman added 2 commits March 24, 2025 16:19

more formatting

b9801eb

more stuff

5699086

LiorLieberman requested a review from a team as a code owner March 24, 2025 16:24

istio-testing merged commit 4e7e188 into istio:experimental-gwapi-inference-extension Mar 24, 2025
28 checks passed

hzxuzhonghu reviewed Mar 25, 2025

danehans mentioned this pull request Jul 15, 2025

add comments to ProcessingMode kgateway-dev/kgateway#11646

Closed

istio-policy-bot mentioned this pull request Jul 22, 2025

[release-1.27] Move Alpha Gateway API Inference Extension support to master (#56845) #57097

Merged
