Skip to content

Fix and simplify Gateway store interfaces#19346

Merged
BenWilson2 merged 4 commits intomlflow:masterfrom
BenWilson2:fix-rest-secret
Dec 12, 2025
Merged

Fix and simplify Gateway store interfaces#19346
BenWilson2 merged 4 commits intomlflow:masterfrom
BenWilson2:fix-rest-secret

Conversation

@BenWilson2
Copy link
Member

@BenWilson2 BenWilson2 commented Dec 12, 2025

🛠 DevTools 🛠

Open in GitHub Codespaces

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19346/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19346/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/19346/merge

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Fix an issue with update_secret store interfaces to support multiple key secrets (i.e., Bedrock, Azure).
Simplify the create and update secrets on all stores to only support dict[str, str] inputs.
Add store coverage for changes.
Add e2e multi-secret lifecycle test to verify proper create and update handling for compound key providers.

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
@BenWilson2 BenWilson2 requested review from TomeHirata and Copilot and removed request for Copilot December 12, 2025 01:57
@github-actions github-actions bot added v3.7.1 area/tracking Tracking service, tracking client APIs, autologging rn/feature Mention under Features in Changelogs. labels Dec 12, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Dec 12, 2025

Documentation preview for aee88ed is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Copilot AI review requested due to automatic review settings December 12, 2025 03:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR simplifies and fixes the Gateway store interfaces for secret management by standardizing secret values to use only dict[str, str] format instead of str | dict[str, str]. This change enables proper support for multi-key secrets (e.g., AWS Bedrock credentials requiring both aws_access_key_id and aws_secret_access_key).

Key changes:

  • Updated all store interfaces to accept only dict[str, str] for secret values
  • Removed automatic string-to-dict conversion logic that used a default "api_key" key
  • Enhanced masking logic to display multiple secret field names for compound credentials
  • Added comprehensive test coverage for multi-key secret lifecycle operations

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
mlflow/store/tracking/gateway/abstract_mixin.py Updated interface signatures to require dict[str, str] for secret_value parameters and improved documentation
mlflow/store/tracking/gateway/sqlalchemy_mixin.py Removed string-to-dict conversion logic, enhanced masking for multi-key secrets, simplified create/update implementations
mlflow/store/tracking/gateway/rest_mixin.py Updated REST store to serialize dict values to JSON strings before API calls
mlflow/server/handlers.py Added _parse_secret_value function to convert JSON strings to dicts with validation
tests/tracking/test_rest_tracking.py Updated all test cases to use dict format and added comprehensive multi-key secret tests
tests/store/tracking/test_secret_cache.py Updated secret cache tests to use dict format for secret values
tests/store/tracking/test_rest_store.py Updated REST store tests to use dict format and added multi-key secret test cases
tests/store/tracking/test_gateway_sql_store.py Updated all SQL store tests to use dict format for secret values
tests/server/test_handlers.py Added unit tests for the new _parse_secret_value function

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +647 to +659
if isinstance(parsed, dict):
return parsed
raise MlflowException(
"secret_value must be a JSON object (dict)",
error_code=INVALID_PARAMETER_VALUE,
)
except json.JSONDecodeError as e:
raise MlflowException(
f"secret_value must be valid JSON: {e}",
error_code=INVALID_PARAMETER_VALUE,
) from e


Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _parse_secret_value function should validate that all dictionary values are strings and that the dictionary is non-empty. Currently, it only checks that the parsed JSON is a dict, but doesn't validate the dict's content matches the expected dict[str, str] type signature. This could allow invalid inputs like empty dicts, dicts with non-string values (numbers, booleans, nested objects, None), which would cause errors later in the encryption/storage logic.

Suggested change
if isinstance(parsed, dict):
return parsed
raise MlflowException(
"secret_value must be a JSON object (dict)",
error_code=INVALID_PARAMETER_VALUE,
)
except json.JSONDecodeError as e:
raise MlflowException(
f"secret_value must be valid JSON: {e}",
error_code=INVALID_PARAMETER_VALUE,
) from e
if not isinstance(parsed, dict):
raise MlflowException(
"secret_value must be a JSON object (dict)",
error_code=INVALID_PARAMETER_VALUE,
)
if not parsed:
raise MlflowException(
"secret_value must be a non-empty JSON object (dict)",
error_code=INVALID_PARAMETER_VALUE,
)
for k, v in parsed.items():
if not isinstance(k, str) or not isinstance(v, str):
raise MlflowException(
"All keys and values in secret_value must be strings.",
error_code=INVALID_PARAMETER_VALUE,
)
return parsed
except json.JSONDecodeError as e:
raise MlflowException(
f"secret_value must be valid JSON: {e}",
error_code=INVALID_PARAMETER_VALUE,
) from e

Copilot uses AI. Check for mistakes.
The created GatewaySecretInfo object with masked value.
"""
secret_value_str = json.dumps(secret_value)
auth_config_json = json.dumps(auth_config) if auth_config else None
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition if auth_config on line 110 will evaluate to False for an empty dict ({}), which is inconsistent with the SQLAlchemy implementation. In sqlalchemy_mixin.py, an empty dict for auth_config is explicitly used to clear the configuration (see line 253-254 in sqlalchemy_mixin.py where the comment states "Empty dict {} explicitly clears auth_config"). Using if auth_config is not None would make this consistent with the SQL implementation and allow empty dicts to be serialized as "{}".

Suggested change
auth_config_json = json.dumps(auth_config) if auth_config else None
auth_config_json = json.dumps(auth_config) if auth_config is not None else None

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid! Will update

The updated GatewaySecretInfo object with masked value.
"""
secret_value_str = json.dumps(secret_value) if secret_value else ""
auth_config_json = json.dumps(auth_config) if auth_config else None
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition if auth_config on line 166 will evaluate to False for an empty dict ({}), which is inconsistent with the SQLAlchemy implementation. In sqlalchemy_mixin.py, an empty dict for auth_config is explicitly used to clear the configuration (see line 253-254 in sqlalchemy_mixin.py where the comment states "Empty dict {} explicitly clears auth_config"). Using if auth_config is not None would make this consistent with the SQL implementation and allow empty dicts to be serialized as "{}".

Suggested change
auth_config_json = json.dumps(auth_config) if auth_config else None
auth_config_json = json.dumps(auth_config) if auth_config is not None else None

Copilot uses AI. Check for mistakes.
@TomeHirata
Copy link
Collaborator

Shouldn't we update the proto for CreateGatewaySecret and UpdateGatewaySecret to support key-value pairs in REST API?

@BenWilson2
Copy link
Member Author

Shouldn't we update the proto for CreateGatewaySecret and UpdateGatewaySecret to support key-value pairs in REST API?

I was weighing that earlier about whether we want to leave it 'open' as JSON serialized string but the I should have just gone with my first instinct and just passed JSON.
I agree... it'll remove the serialization / deserialization that we're doing between the stores right now and simplify the handlers. I'll make the change :)

Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Copy link
Collaborator

@TomeHirata TomeHirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
@BenWilson2 BenWilson2 enabled auto-merge December 12, 2025 04:10
@BenWilson2 BenWilson2 added this pull request to the merge queue Dec 12, 2025
Merged via the queue into mlflow:master with commit 7f5fcaa Dec 12, 2025
54 checks passed
@BenWilson2 BenWilson2 deleted the fix-rest-secret branch December 12, 2025 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tracking Tracking service, tracking client APIs, autologging rn/feature Mention under Features in Changelogs. v3.7.1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants