
[Endpoints] [9/x] Add provider, model, and configuration handling#19009

Merged
BenWilson2 merged 1 commit into mlflow:master from BenWilson2:stack/endpoints/litellm
Dec 11, 2025

Conversation

@BenWilson2
Member

@BenWilson2 BenWilson2 commented Nov 24, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Adds backend APIs for fetching providers, models from providers, and provider-specific configurations needed to fulfill a request. Adds a new dependency to the [genai] optional installation to support these backend APIs.
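As a rough illustration of the three lookups described above, the shape of the API might resemble the sketch below. All names, providers, and models here are hypothetical placeholders; the actual implementation delegates to LiteLLM rather than a static registry.

```python
# Hypothetical sketch of the three backend lookups: providers, models per
# provider, and provider-specific configuration. A tiny static registry
# stands in for the LiteLLM-backed implementation; all values are placeholders.
_REGISTRY = {
    "openai": {
        "models": ["gpt-4o", "gpt-4o-mini"],
        "config": {"api_key_env": "OPENAI_API_KEY"},
    },
    "anthropic": {
        "models": ["claude-3-5-sonnet"],
        "config": {"api_key_env": "ANTHROPIC_API_KEY"},
    },
}


def list_providers() -> list[str]:
    """Return the names of all supported providers."""
    return sorted(_REGISTRY)


def list_models(provider: str) -> list[str]:
    """Return the models available from a single provider."""
    return _REGISTRY.get(provider, {}).get("models", [])


def get_provider_config(provider: str) -> dict[str, str]:
    """Return the provider-specific configuration needed to fulfill a request."""
    return _REGISTRY.get(provider, {}).get("config", {})
```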

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

@BenWilson2 BenWilson2 changed the title from "dynamic providers" to "[Endpoints] [9/x] Add provider, model, and configuration handling" Nov 24, 2025
@BenWilson2 BenWilson2 marked this pull request as ready for review November 24, 2025 22:06
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch from 6a9292e to 4c0e39f on November 24, 2025 22:19
@github-actions github-actions bot added the labels area/tracking (Tracking service, tracking client APIs, autologging) and rn/feature (Mention under Features in Changelogs) Nov 24, 2025
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 2 times, most recently from e176d7d to 00d7a50 on November 26, 2025 03:40
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch from 00d7a50 to de3bd35 on November 27, 2025 06:31
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 2 times, most recently from 85037f1 to 0526203 on December 2, 2025 02:14
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 2 times, most recently from 93c2b51 to dca21ae on December 2, 2025 22:49
@github-actions
Contributor

github-actions bot commented Dec 2, 2025

Documentation preview for 146bd37 is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 3 times, most recently from 0bcf261 to 2546def on December 3, 2025 22:36
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch from 3c5d12d to 02deb34 on December 9, 2025 20:27
@BenWilson2
Member Author

@BenWilson2 I thought we would just rely on LiteLLM because we don't want to manage the custom mapping. Did we change the decision? At the very least we shouldn't have both. If we need to maintain the static mapping anyway, we don't need the litellm-based backend handler and can just share the config file between the UI and backend.

@B-Step62 litellm responds with a list of all optional valid fields, but doesn't organize them into "these are the ones that are used together", hence the need for the static mapping. It's not ideal, but it's better than hard-coding mappings for all providers, AND it allows us to validate whether the backend can support a given provider (which is why, for this version, I surmised that allowing an editable connection configuration might be a bad idea for an arbitrary provider: a user wouldn't know whether it's valid until after having spent time configuring an endpoint).

For the static mapping, I figured that it would be less confusing to users if we provided the auth groupings that are required for these providers instead of a full list of "choose the ones you know you need" to reduce the cognitive load when creating an endpoint with a given selectable configuration.

Open to hearing thoughts on alternatives here!
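The "auth groupings" idea described above could be sketched roughly as follows. The mapping contents, mode names, and field names here are illustrative assumptions, not the actual contents of MLflow's `_PROVIDER_AUTH_MODES`.

```python
# Hypothetical illustration of grouping credential fields into coherent auth
# modes, rather than presenting users with a flat list of every optional
# field. Provider, mode, and field names are assumptions for illustration.
_PROVIDER_AUTH_MODES = {
    "bedrock": [
        {
            "mode": "access_keys",
            "fields": ["aws_access_key_id", "aws_secret_access_key", "aws_region"],
        },
        {
            "mode": "iam_role",
            "fields": ["aws_role_arn", "aws_region"],
        },
    ],
}


def auth_fields(provider: str, mode: str) -> list[str]:
    """Return the set of credential fields that belong together for one auth mode."""
    for entry in _PROVIDER_AUTH_MODES.get(provider, []):
        if entry["mode"] == mode:
            return entry["fields"]
    raise ValueError(f"Unknown auth mode {mode!r} for provider {provider!r}")
```

The UI can then offer a small choice of complete, valid groupings instead of asking the user to pick individual fields.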

@BenWilson2
Member Author

Btw, for provider-specific configuration, can we reuse our configurations for the existing gateway? These classes already specify the necessary information to make an LLM call for a specific provider. We'd still need to rely on LiteLLM for models, though.

@TomeHirata the config that we have in Gateway doesn't have the other fields that the UI needs for displaying helper text and easily legible naming information. Those classes do contain the field mappings for validation, but I felt it would be better to keep the full config for these special-case providers in a single location (I also didn't want to link a soon-to-be-deprecated portion of the code base to a new feature, which would complicate cleanup in the future).

@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 2 times, most recently from f076132 to dd53418 on December 9, 2025 22:45
@TomeHirata
Collaborator

the config that we have in Gateway doesn't have the other fields that we want to use for UI to display for helper text / easily legible naming or information. They do contain the field mappings for validation, but I felt that it might be better to contain the full config for these special case providers in a single location

Understood. To share the context behind my question: we'll have two types of providers in our gateway, first-tier support, for which MLflow owns schema unification, and second-tier support, for which we'll use LiteLLM. For the first-tier providers (openai, anthropic, ...) we'll still reuse the existing gateway provider implementation, and the per-provider configuration should be consistent between here and the gateway provider implementation. Not a blocker; we can add more auth modes to the gateway implementation later.

@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 2 times, most recently from bc4092d to 728019d on December 10, 2025 20:10

@TomeHirata TomeHirata left a comment


@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch from 728019d to 249b195 on December 11, 2025 04:14
@B-Step62
Collaborator

B-Step62 commented Dec 11, 2025

@B-Step62 litellm responds with a list of all optional valid fields, but doesn't organize them into "these are the ones that are used together", hence the need to have the static mapping. It's not ideal, but it's better than having hard-mappings of all providers AND allows us to validate that the backend can support or not support a given provider (which is why for this version I surmised that allowing an editable connection configuration might be a bad idea for a random provider - a user wouldn't know until after having spent time configuring an endpoint that it's valid or not).

@BenWilson2 Hmm, I'm still not following the intention of the current implementation. According to the description you tried to add the missing information (combinations of optional fields), but reading the logic of _get_credential_fields, we never use the hard-coded mapping in conjunction with the LiteLLM response.

If I read the code correctly, what the current implementation does is:

  1. Get field configuration from LiteLLM via get_provider_fields.
  2. If it returns some configuration, return it as is.
  3. Otherwise, fall back to the static config.

Am I missing something? I don't see _PROVIDER_CREDENTIAL_MAPPING used anywhere else.

    get_provider_fields = _get_provider_fields()
    provider_fields = get_provider_fields(provider)

    if provider_fields and len(provider_fields) > 0:
        return [
            {
                "name": field["field_name"],
                "type": field.get("field_type", "string"),
                "description": field.get("field_description", ""),
                "required": True,
            }
            for field in provider_fields
        ]
    elif provider in _PROVIDER_CREDENTIAL_MAPPING:
        return _PROVIDER_CREDENTIAL_MAPPING[provider]

    return []

P.S. Re-reading your comment above, it seems we are talking about different mappings. What I referred to is _PROVIDER_CREDENTIAL_MAPPING; it seems you are talking about _PROVIDER_AUTH_MODES, and I understand we need the latter, as discussed offline.
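The fallback behavior described above can be exercised with a stub in place of the LiteLLM lookup. This is a self-contained sketch; the `_PROVIDER_CREDENTIAL_MAPPING` contents and the `credential_fields` wrapper are illustrative, not the actual MLflow code.

```python
# Stub-based sketch of the fallback described above: fields reported by the
# LiteLLM lookup win; the static mapping is consulted only when the lookup
# returns nothing. Mapping contents are illustrative.
_PROVIDER_CREDENTIAL_MAPPING = {
    "customprov": [
        {"name": "api_key", "type": "string", "description": "", "required": True},
    ],
}


def credential_fields(provider, get_provider_fields):
    """Resolve credential fields, preferring the (injected) LiteLLM lookup."""
    provider_fields = get_provider_fields(provider)
    if provider_fields:
        return [
            {
                "name": field["field_name"],
                "type": field.get("field_type", "string"),
                "description": field.get("field_description", ""),
                "required": True,
            }
            for field in provider_fields
        ]
    # Static fallback: reached only when the lookup yields no fields, so the
    # two sources are never combined -- the redundancy the thread points out.
    return _PROVIDER_CREDENTIAL_MAPPING.get(provider, [])
```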

@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch from 249b195 to 484ac23 on December 11, 2025 14:54
@BenWilson2
Member Author

@B-Step62 litellm responds with a list of all optional valid fields, but doesn't organize them into "these are the ones that are used together", hence the need to have the static mapping. It's not ideal, but it's better than having hard-mappings of all providers AND allows us to validate that the backend can support or not support a given provider (which is why for this version I surmised that allowing an editable connection configuration might be a bad idea for a random provider - a user wouldn't know until after having spent time configuring an endpoint that it's valid or not).

@BenWilson2 Hmm I'm still not following what is the intention of the current implementation.... according to the description you tried to add missing information (combination of optional fields), but if I read the logic of _get_credential_fields, we never use the hard-coded mapping in conjunction with the LiteLLM response.

If I read the code correctly, what current implementation does is

  1. Get field configuration from LiteLLM via get_provider_fields.
  2. If it returns some configuration, return it as is.
  3. Otherwise, fallback to the static config.

Am I missing something? I don't see _PROVIDER_CREDENTIAL_MAPPING used anywhere else.

    get_provider_fields = _get_provider_fields()
    provider_fields = get_provider_fields(provider)

    if provider_fields and len(provider_fields) > 0:
        return [
            {
                "name": field["field_name"],
                "type": field.get("field_type", "string"),
                "description": field.get("field_description", ""),
                "required": True,
            }
            for field in provider_fields
        ]
    elif provider in _PROVIDER_CREDENTIAL_MAPPING:
        return _PROVIDER_CREDENTIAL_MAPPING[provider]

    return []

P.S. Re-read your comment above, it seems we are talking about different mapping. What I referred to is _PROVIDER_CREDENTIAL_MAPPING. It seems you are talking about _PROVIDER_AUTH_MODES, and I understand we need the latter as discussed offline.

WOW, yeah, I was 100% referring to the auth modes mapping. Sorry about that!
We're:

  1. Requiring litellm to be installed
  2. Needing the config to be compatible with the library interface

Having a 'safe fallback' is pointless and is just another vector for us to keep updated as new providers are added (or dropped).

@B-Step62 thanks for refocusing my attention on this. I'm going to remove the typed dict entry and the fallback mapping, as it provides zero value and actively makes maintenance harder.

Thank you!

@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch 2 times, most recently from 3a3246b to a73fbd3 on December 11, 2025 16:02
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
@BenWilson2 BenWilson2 force-pushed the stack/endpoints/litellm branch from a73fbd3 to 146bd37 on December 11, 2025 16:10
@BenWilson2 BenWilson2 added this pull request to the merge queue Dec 11, 2025
Merged via the queue into mlflow:master with commit d4bcdf2 Dec 11, 2025
62 of 70 checks passed
@BenWilson2 BenWilson2 deleted the stack/endpoints/litellm branch December 11, 2025 17:20
@@ -4110,9 +4110,11 @@ message CreateGatewaySecret {
optional string secret_value = 2;
Collaborator


@BenWilson2 I think we should update the schema for secret_value here to support arbitrary key-value pairs?

@@ -22,7 +22,6 @@ def create_secret(
secret_name: str,
secret_value: str,
Collaborator


@BenWilson2 I think we should update the type hint here to str | dict[str, str]?
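The suggested type-hint change could look roughly like the sketch below. This is not the actual MLflow signature; the normalization step is a hypothetical design choice for illustration.

```python
from __future__ import annotations


def create_secret(secret_name: str, secret_value: str | dict[str, str]) -> dict:
    # Hypothetical sketch of the suggested hint: accept either a single secret
    # string or an arbitrary key-value mapping, and normalize the string form
    # into a one-entry mapping so downstream code handles only one shape.
    if isinstance(secret_value, str):
        secret_value = {"value": secret_value}
    return {"name": secret_name, "value": secret_value}
```

Normalizing at the boundary keeps the multi-field auth configurations discussed earlier in the thread (e.g. access key plus region) on the same code path as simple API-key secrets.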


Labels

area/tracking (Tracking service, tracking client APIs, autologging), rn/feature (Mention under Features in Changelogs), team-review (Trigger a team review request)


4 participants