
[InferenceClient] Server-side auto-routing for conversational task #1810

Merged: Wauplin merged 3 commits into main from server-side-auto-routing-for-conversational on Oct 21, 2025

@Wauplin (Contributor) commented Oct 17, 2025

Equivalent Python PR: huggingface/huggingface_hub#3448

Discussed in private DMs.

Now that we have server-side routing on https://router.huggingface.co/v1/chat/completions, it's best to use it in the JS client for the conversational task: the routing logic is centralized between the JS and Python clients, and it saves 1 HTTP call. We still keep client-side routing for all other tasks.
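To illustrate what server-side routing means for a caller, here is a minimal sketch, not the code from this PR. It only builds the request that would be sent to the router endpoint quoted above, with no provider chosen on the client. The helper name `buildChatCompletionRequest`, the `ChatMessage` and `BuiltRequest` types, and the placeholder model id and token are hypothetical; the body shape assumes the endpoint accepts an OpenAI-style `{ model, messages }` payload with a bearer token.

```typescript
// Hypothetical helper for illustration; not part of @huggingface/inference.
const ROUTER_CHAT_URL = "https://router.huggingface.co/v1/chat/completions";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds a chat-completion request addressed to the router.
// No provider lookup happens here: the client sends only the model id
// and lets the server decide where to route the call (assumed behavior,
// based on the PR description).
function buildChatCompletionRequest(
  model: string,
  messages: ChatMessage[],
  accessToken: string
): BuiltRequest {
  return {
    url: ROUTER_CHAT_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      // Assumed OpenAI-style payload.
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Placeholder values; replace with a real model id and token before sending.
const exampleRequest = buildChatCompletionRequest(
  "org/model-name",
  [{ role: "user", content: "Hello!" }],
  "hf_placeholder_token"
);
```

The built request could then be sent with `fetch(exampleRequest.url, exampleRequest.init)`. With client-side routing, the client would first need an extra call to find out which provider serves the model; that is the HTTP call this PR saves for the conversational task.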

@Wauplin changed the title from "Server-side auto-routing for conversational task" to "[InferenceClient] Server-side auto-routing for conversational task" on Oct 17, 2025
@Wauplin merged commit 027c4d2 into main on Oct 21, 2025
5 checks passed
@Wauplin deleted the server-side-auto-routing-for-conversational branch on October 21, 2025 11:43
SBrandeis added a commit that referenced this pull request Nov 5, 2025
Follow-up to #1810 

Applies the same changes to the streaming counterpart

2 participants