Closed
Labels
enhancement (New feature or request)
Description:
DynamicLoadBalancing extends the gateway routing system by picking the endpoints (Pod IPs, etc.) directly, which is great. But after checking the code, I found two limitations:
- From the API perspective, and `AIGatewayRoute` in particular, only AIE is supported; see the code link below. For users like us who don't use AIE, we hope to have this support as well. I know we can update the ConfigMap to enable this, but that is a bit complex.

  ai-gateway/internal/controller/ai_gateway_route.go
  Lines 295 to 302 in be2b479

  ```go
  if isInferencePoolRef(backendRef) {
  	var pool *gwaiev1a2.InferencePool
  	var referencedAIServiceBackends []aigv1a1.AIServiceBackend
  	pool, referencedAIServiceBackends, err = c.getPoolAndReferencedAIServiceBackends(ctx, aiGatewayRoute.Namespace, backendRef.Name)
  	if err != nil {
  		return fmt.Errorf("failed to get pool and referenced AIServiceBackends: %w", err)
  	}
  	ecBackendConfig.DynamicLoadBalancing, err = c.createDynamicLoadBalancing(ctx, i, j, pool, referencedAIServiceBackends)
  ```
- The Pod IPs are attached to the Backend configurations, which means that whenever the Pods change, we have to update the configurations. I'm not sure whether this is efficient. It may still be useful to some users with static Pod IPs, but we hope to manage the Pods ourselves. So we may implement a new DynamicLoadBalancing to replace the default one.
This is a blocking issue for us. We are currently building a metrics-based router: we have information on all the Pods from collecting the necessary metrics, and we want to bake it into the Envoy AI Gateway as a plugin. Within the dynamic load balancer, we can then make wise routing decisions by picking the Pod endpoint directly.
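To make the request more concrete, here is a rough sketch of the kind of decision a metrics-based picker could make. Everything in it is an assumption for illustration: `PodMetrics`, its fields, `PickEndpoint`, and the "fewest queued requests" policy are hypothetical names and choices, not part of the Envoy AI Gateway API or of our actual router.

```go
package main

import (
	"errors"
	"fmt"
)

// PodMetrics is a hypothetical snapshot of one Pod's load, as a
// metrics collector might report it. The fields are illustrative only.
type PodMetrics struct {
	Address        string  // Pod IP and port, e.g. "10.0.0.12:8000"
	QueuedRequests int     // requests waiting on this Pod
	KVCacheUsage   float64 // fraction of KV cache in use, 0.0 to 1.0
}

// ErrNoEndpoints is returned when there is no Pod to choose from.
var ErrNoEndpoints = errors.New("no endpoints available")

// PickEndpoint returns the address of the Pod with the fewest queued
// requests, breaking ties by the lower KV cache usage.
func PickEndpoint(pods []PodMetrics) (string, error) {
	if len(pods) == 0 {
		return "", ErrNoEndpoints
	}
	best := pods[0]
	for _, p := range pods[1:] {
		fewerQueued := p.QueuedRequests < best.QueuedRequests
		tieButLessCache := p.QueuedRequests == best.QueuedRequests &&
			p.KVCacheUsage < best.KVCacheUsage
		if fewerQueued || tieButLessCache {
			best = p
		}
	}
	return best.Address, nil
}

func main() {
	// Example metrics; in a real router these would come from the collector.
	pods := []PodMetrics{
		{Address: "10.0.0.1:8000", QueuedRequests: 5, KVCacheUsage: 0.2},
		{Address: "10.0.0.2:8000", QueuedRequests: 1, KVCacheUsage: 0.9},
		{Address: "10.0.0.3:8000", QueuedRequests: 1, KVCacheUsage: 0.3},
	}
	addr, err := PickEndpoint(pods)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("selected endpoint:", addr)
}
```

The point is only that the picker works from live metrics and returns a Pod address per request, so the Pod IPs would not need to be written into the Backend configuration.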
Correct me if I misunderstood the system design here. Thanks!