
support endpoint picker based on header/dynamic metadata #6234

@Xunzhuo

Description:

Describe the desired behavior, what scenario it enables and how it
would be used.

AIBrix:

Looking at https://github.com/vllm-project/aibrix/blob/main/config/gateway/gateway.yaml, it looks like AIBrix needs to apply an EnvoyPatchPolicy to every new Gateway instance. Any thoughts on replacing that with another approach? The patch does two things:

  1. Route to the extension server only when the path matches /v1 and the route-strategy header is set
  2. Route to an endpoint based on the target-pod header

The Gateway API Inference Extension also needs to pick an endpoint based on a header or metadata.

This could be based on Envoy's Override Host load balancer policy (https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/load_balancing_policies/override_host/v3/override_host.proto), which also supports a fallback IP that can be used during a retry.
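To make the intended behavior concrete, here is a minimal sketch of override-host selection with a retry fallback. It is a simplified model and not Envoy's implementation: `selectHost` and `defaultPick` are hypothetical names, and the comma-separated value format and the addresses are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// selectHost returns the address to use for a given attempt (0 = first try).
// overrides is the raw header or metadata value, assumed here to be a
// comma-separated list of addresses in priority order.
// defaultPick is a hypothetical stand-in for the regular load balancer and is
// used when no override address is available for this attempt.
func selectHost(overrides string, attempt int, defaultPick func() string) string {
	var hosts []string
	for _, part := range strings.Split(overrides, ",") {
		if p := strings.TrimSpace(part); p != "" {
			hosts = append(hosts, p)
		}
	}
	if attempt >= 0 && attempt < len(hosts) {
		return hosts[attempt]
	}
	return defaultPick()
}

func main() {
	lb := func() string { return "10.0.0.99:8000" } // hypothetical default pick
	for attempt := 0; attempt < 3; attempt++ {
		fmt.Println(attempt, selectHost("10.0.0.12:8000, 10.0.0.13:8000", attempt, lb))
	}
}
```

In this model the first address serves the initial attempt, later addresses serve retries, and the regular load balancer takes over once the list is exhausted or when no override was supplied.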

Labels: area/api, area/llm
