Closed
1 of 2 issues completed
Description:
Describe the desired behavior, what scenario it enables and how it
would be used.
AIBrix:
Looking at https://github.com/vllm-project/aibrix/blob/main/config/gateway/gateway.yaml, it seems AIBrix needs to apply an EnvoyPatchPolicy to every new Gateway instance. Thoughts on replacing it with another approach? What needs to be supported:
- Route to the extension server only when the path matches `/v1` and the `route-strategy` header is present
- Route to a specific endpoint based on the `target-pod` header
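
To make the two rules concrete, here is a minimal Python sketch of the routing decision. It is only a model of the behavior described above, not AIBrix or Envoy code. The `/v1` match is treated as a prefix match, the `target-pod` rule is checked first, and the `extension-server` and `default-backend` labels are hypothetical; all three are assumptions made for illustration.

```python
from typing import Mapping

# Hypothetical destination labels, used only for this illustration.
EXTENSION_SERVER = "extension-server"
DEFAULT_BACKEND = "default-backend"


def choose_route(path: str, headers: Mapping[str, str]) -> str:
    """Return a label for where a request would be sent."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}

    # Rule 2: a target-pod header pins the request to that endpoint.
    # Checking this rule first is an assumption, not something the issue states.
    target = h.get("target-pod")
    if target:
        return f"endpoint:{target}"

    # Rule 1: only /v1 requests carrying route-strategy go to the extension server.
    # Treating "/v1" as a path prefix is an assumption.
    if path.startswith("/v1") and "route-strategy" in h:
        return EXTENSION_SERVER

    # Everything else takes the normal route.
    return DEFAULT_BACKEND
```

For example, `choose_route("/v1/completions", {"route-strategy": "least-request"})` selects the extension server, while the same path without that header falls through to the default backend.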
The Gateway API inference extension also needs to pick an endpoint based on a header or metadata.
This could be based on Envoy's Override Host Load Balancer Policy (https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/load_balancing_policies/override_host/v3/override_host.proto), which supports a fallback IP that can be used during a retry.
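
The idea of a header-selected endpoint with a fallback on retry can be sketched as below. This is a simplified Python model, not Envoy's implementation. It assumes the `target-pod` header may carry a comma-separated list of addresses (primary first, fallbacks after) and that once those are exhausted any remaining known endpoint may be used; both are assumptions made for illustration.

```python
from typing import Mapping, Optional, Sequence


def pick_endpoint(
    headers: Mapping[str, str],
    endpoints: Sequence[str],
    attempt: int = 0,
    header_name: str = "target-pod",
) -> Optional[str]:
    """Pick the endpoint for a given attempt (0 = first try, 1 = first retry, ...)."""
    normalized = {k.lower(): v for k, v in headers.items()}
    raw = normalized.get(header_name, "")

    # Assumed format: "primary, fallback1, fallback2".
    requested = [a.strip() for a in raw.split(",") if a.strip()]

    # Ignore addresses that are not part of the known endpoint set.
    known = [a for a in requested if a in endpoints]

    # Use the header-provided addresses in order, one per attempt.
    if attempt < len(known):
        return known[attempt]

    # Header list exhausted (or absent): fall back to endpoints not yet tried.
    remaining = [e for e in endpoints if e not in known]
    idx = attempt - len(known)
    return remaining[idx] if idx < len(remaining) else None
```

With endpoints `["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]` and a header value of `10.0.0.2:8000, 10.0.0.3:8000`, the first attempt goes to `10.0.0.2:8000` and the first retry to `10.0.0.3:8000`.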