Skip to content

Add validation to inference update operation#140003

Merged
Jan-Kazlouski-elastic merged 138 commits intoelastic:mainfrom
Jan-Kazlouski-elastic:add-update-validation
Feb 2, 2026
Merged

Add validation to inference update operation#140003
Jan-Kazlouski-elastic merged 138 commits intoelastic:mainfrom
Jan-Kazlouski-elastic:add-update-validation

Conversation

@Jan-Kazlouski-elastic
Copy link
Copy Markdown
Contributor

This PR fixes: #122356
Validation is added to Update action.

#137241 (comment) suggests 3 options for fixing this.

  1. Converting the Model back to an UnparsedModel. Not implemented because Model fields are immutable and cannot be changed.
  2. Run the updates on the various settings while they are still unparsed. Not implemented because in that case validation methods such as newSecretSettings and updateServiceSettings, updateTaskSettings are not called that when combining, so we'd have to create two parsed Models instead of one. Logic of dealing with maps proved to be very complex and risky.
  3. Move combineExistingModelWithNewSettings into a more service specific location.

Third approach was implemented by combining of configs and secrets was separated into 2 methods that return ModelConfigurations and ModelSecrets respectively, by calling specially implemented methods. Based on those ModelConfigurations and ModelSecrets - by introducing new method for building a Model to InferenceService interface that uses provider specific constructors we're able to perform the validation using the updated Model that is going to be replacing the existing one.

  • - Have you signed the contributor license agreement?
  • - Have you followed the contributor guidelines?
  • - If submitting code, have you built your formula locally prior to submission with gradle check?
  • - If submitting code, is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed.
  • - If submitting code, have you checked that your submission is for an OS and architecture that we support?
  • - If you are submitting this code for a class then read our policy for that.

@elasticsearchmachine elasticsearchmachine added v9.4.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Dec 25, 2025
@Jan-Kazlouski-elastic Jan-Kazlouski-elastic marked this pull request as ready for review December 30, 2025 21:36
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Dec 30, 2025
@PeteGillinElastic PeteGillinElastic added the :SearchOrg/Inference Label for the Search Inference team label Jan 5, 2026
@elasticsearchmachine elasticsearchmachine added Team:Search - Inference and removed needs:triage Requires assignment of a team area label labels Jan 5, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/search-inference-team (Team:Search - Inference)

@DonalEvans DonalEvans self-assigned this Jan 7, 2026
@DonalEvans DonalEvans self-requested a review January 7, 2026 22:11
@DonalEvans DonalEvans added the >bug label Jan 7, 2026
@DonalEvans
Copy link
Copy Markdown
Contributor

@Jan-Kazlouski-elastic Please could you add an appropriate changelog file at docs/changelog/140003.yaml? Thanks!

Copy link
Copy Markdown
Contributor

@DonalEvans DonalEvans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a lot of duplication of logic between the buildModelFromConfigAndSecrets() implementations and the existing logic in parseRequestConfig() for each service class. Would it be possible to extract the common code into methods that can be called from either buildModelFromConfigAndSecrets() or parseRequestConfig() ?

Also, there are no tests added in this PR, despite a lot of behaviour being changed. Could you please add unit tests for each of the buildModelFromConfigAndSecrets() implementations, and if possible, tests that cover the new behaviour in TransportUpdateInferenceModelAction, showing that attempting to update the model to be invalid causes an error and the model not to be updated, but updating to a valid model succeeds.

Comment on lines +175 to +183
var serviceSettings = config.getServiceSettings();
var secretSettings = secrets.getSecretSettings();

return new DeepSeekChatCompletionModel(
(DeepSeekChatCompletionModel.DeepSeekServiceSettings) serviceSettings,
(DefaultSecretSettings) secretSettings,
config,
new ModelSecrets(secretSettings)
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with other services, there should be a check that the taskType passed in is supported here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

taskType,
NAME,
(NvidiaChatCompletionServiceSettings) serviceSettings,
(DefaultSecretSettings) secretSettings
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These casts to DefaultSecretSettings can be removed if the constructors for the Nvidia*Model classes are left as they were before this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

taskType,
NAME,
(OpenShiftAiChatCompletionServiceSettings) serviceSettings,
(DefaultSecretSettings) secretSettings
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These casts to DefaultSecretSettings can be removed if the constructors for the OpenShift models are left as they were before this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 238 to 248
TaskSettings existingTaskSettings = existingConfigs.getTaskSettings();
SecretSettings existingSecretSettings = existingParsedModel.getSecretSettings();

SecretSettings newSecretSettings = existingSecretSettings;
TaskSettings newTaskSettings = existingTaskSettings;
ServiceSettings newServiceSettings = existingConfigs.getServiceSettings();

if (settingsToUpdate.serviceSettings() != null && existingSecretSettings != null) {
newSecretSettings = existingSecretSettings.newSecretSettings(settingsToUpdate.serviceSettings());
}
if (settingsToUpdate.serviceSettings() != null) {
newServiceSettings = newServiceSettings.updateServiceSettings(settingsToUpdate.serviceSettings());
if (newSettings.serviceSettings() != null) {
newServiceSettings = newServiceSettings.updateServiceSettings(newSettings.serviceSettings());
}
if (settingsToUpdate.taskSettings() != null && existingTaskSettings != null) {
newTaskSettings = existingTaskSettings.updatedTaskSettings(settingsToUpdate.taskSettings());
}

if (existingParsedModel.getTaskType().equals(resolvedTaskType) == false) {
throw new ElasticsearchStatusException("Task type must match the task type of the existing endpoint", RestStatus.BAD_REQUEST);
if (newSettings.taskSettings() != null && existingTaskSettings != null) {
newTaskSettings = existingTaskSettings.updatedTaskSettings(newSettings.taskSettings());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a little more consistent as:

TaskSettings existingTaskSettings = existingConfigs.getTaskSettings();
ServiceSettings existingServiceSettings = existingConfigs.getServiceSettings();

TaskSettings newTaskSettings = existingTaskSettings;
ServiceSettings newServiceSettings = existingServiceSettings;

if (newSettings.serviceSettings() != null) {
    newServiceSettings = newServiceSettings.updateServiceSettings(newSettings.serviceSettings());
}
if (newSettings.taskSettings() != null && existingTaskSettings != null) {
    newTaskSettings = newTaskSettings.updatedTaskSettings(newSettings.taskSettings());
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Done.

@Jan-Kazlouski-elastic
Copy link
Copy Markdown
Contributor Author

@Jan-Kazlouski-elastic Please could you add an appropriate changelog file at docs/changelog/140003.yaml? Thanks!

Added.

@Jan-Kazlouski-elastic
Copy link
Copy Markdown
Contributor Author

Apologies if they've been added and I've missed them, but we definitely need a test (possibly in InferenceCrudIT, but I could also see an argument for creating an update-specific test class) for the chunking settings being overwritten bug, and tests that confirm that we validate the model when doing an update, both successful and unsuccessful. I think it will be necessary to modify the Test*ServiceExtension classes to allow validation to fail, because right now as long as the task type is correct, inference calls always succeed on the test services.

Hello @DonalEvans @jonathan-buttner
PR has been updated with unit tests added in TransportUpdateInferenceModelActionTests and InferenceCrudIT and is ready for another round of review.
The validation failures in InferenceCrudIT are triggered by validation_should_fail boolean task setting.

Copy link
Copy Markdown
Contributor

@DonalEvans DonalEvans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work expanding the unit test coverage for TransportUpdateInferenceModelAction!

Comment on lines +127 to +130
if (Objects.equals(
((TestRerankingServiceExtension.TestTaskSettings) model.getTaskSettings()).shouldFailValidation(),
Boolean.TRUE
)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, but this can be made a little more readable by changing it to:

if (((TestRerankingServiceExtension.TestTaskSettings) model.getTaskSettings()).shouldFailValidation()) {

since the TestRerankingServiceExtension version of TestTaskSettings.shouldFailValidation() returns a primitive boolean which can never be null.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Fixed.

ResponseException.class,
() -> updateEndpoint("sparse_embedding_model", updateConfigWithFailedValidationFlag(SPARSE_EMBEDDING))
);
assertThat(e.getMessage(), containsString("validation call intentionally failed based on task settings"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to asserting that the validation call fails, these tests should also check that the model was not actually updated by getting the settings and confirming that they were not changed, since that's what we really care about here.

As a side note, I tried to copy what the testCRUD() method was doing to check the api_key setting (starting on line 102), but it seems like the test is flawed, because the map returned by getModels() does not contain an entry for "api_key", because we don't return the API key with the other service settings. That test also incorrectly compares the old API key value from the model with the local variable used to set the new API key, not the value returned from the model after updating it, which is why the test is able to pass despite the field not being present.

I don't think we actually have a way to confirm that the API key was updated, since there's no way to get the API key associated with an endpoint (as far as I know), so we'll have to confirm that the settings as a whole weren't updated instead, something like this:

putModel("sparse_embedding_model", mockSparseServiceModelConfig(), SPARSE_EMBEDDING);
var originalModel = getModel("sparse_embedding_model");
var e = expectThrows(
    ResponseException.class,
    () -> updateEndpoint("sparse_embedding_model", updateConfigWithFailedValidationFlag(SPARSE_EMBEDDING))
);
assertThat(e.getMessage(), containsString("validation call intentionally failed based on task settings"));
var modelAfterUpdate = getModel("sparse_embedding_model");
assertThat(modelAfterUpdate, is(originalModel));

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the proposed checks. Thank you.

@@ -71,7 +71,7 @@ private Model storeWorkingEndpoint(String id) {
TaskType.SPARSE_EMBEDDING,
"test_service",
new TestSparseInferenceServiceExtension.TestServiceSettings("model", null, false),
new AbstractTestInferenceService.TestTaskSettings(randomInt(3)),
new AbstractTestInferenceService.TestTaskSettings(randomInt(3), randomBoolean()),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to avoid potential confusion, it would be better to explicitly pass false in here and on line 94, even though the tests pass with either value because no validation is performed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +128 to +134
var exceptionMessage = "Model not found";
mockGetModelWithSecretsToFailWithException(new RuntimeException(exceptionMessage));

var listener = callMasterOperationWithActionFuture();

var exception = expectThrows(RuntimeException.class, () -> listener.actionGet(TimeValue.timeValueSeconds(5)));
assertThat(exception.getMessage(), sameInstance(exceptionMessage));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test could be made slightly stronger by asserting that the exception that gets thrown is the same instance, not that the message is the same instance:

var expectedException = new RuntimeException("Model not found");
mockGetModelWithSecretsToFailWithException(expectedException);

var listener = callMasterOperationWithActionFuture();

var exception = expectThrows(RuntimeException.class, () -> listener.actionGet(TimeValue.timeValueSeconds(5)));
assertThat(exception, sameInstance(expectedException));

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Done.


var listener = callMasterOperationWithActionFuture();

var exception = expectThrows(ElasticsearchStatusException.class, () -> listener.actionGet(TimeValue.timeValueSeconds(5)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's a specific reason to use a shorter timeout, we should be using ESTestCase.TEST_REQUEST_TIMEOUT instead of 5 seconds in these tests. Using too short a timeout runs the risk of flaky failures if the hardware the test runs on is slow or there's a big GC just at the wrong time or something.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I just used the one that was there for this test file already. Replaced it with the common timeout value. Good catch.

assertThat(exception.getMessage(), is("Failed to update model, updated model not found"));
}

public void testMasterOperation_GetModelThrownException_ThrowsElasticsearchStatusException() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test name doesn't match what the test is asserting, since the exception thrown is a RuntimeException, not an ElasticsearchStatusException

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Improved the test with sameInstance assertion for the exception thrown.

Comment on lines +474 to +486
verify(model).getConfigurations();
verify(model).getInferenceEntityId();
verify(model).getTaskType();
verifyNoMoreInteractions(model);

verify(model.getConfigurations()).getServiceSettings();
verify(model.getConfigurations()).getTaskSettings();
verify(model.getConfigurations()).getChunkingSettings();

verifyNoInteractions(serviceSettings);
verifyNoInteractions(taskSettings);
verifyNoInteractions(model.getConfigurations().getChunkingSettings());
verifyNoInteractions(secretSettings);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this verification necessary? We only really care that the configs aren't updated, not whether we call these getters, since getters should have no effect on anything.

There's quite a lot of unnecessary verifications in these tests, so I won't comment on every single one, but in general, verification should only check things that we expect to have an impact on behaviour/state, so methods with no side effects like getters don't need to be verified. In most of these tests, simply confirming that the state of the configurations matches what's expected is enough.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +516 to +534
verify(model).getConfigurations();
verify(model).getInferenceEntityId();
verify(model).getTaskType();
verifyNoMoreInteractions(model);

verify(model.getConfigurations()).getServiceSettings();
verify(model.getConfigurations()).getTaskSettings();
verify(model.getConfigurations()).getChunkingSettings();

verify(originalServiceSettings).updateServiceSettings(newServiceSettingsMap);
verifyNoMoreInteractions(originalServiceSettings);

verify(originalTaskSettings).updatedTaskSettings(newTaskSettingsMap);
verifyNoMoreInteractions(originalTaskSettings);

verifyNoInteractions(updatedServiceSettings);
verifyNoInteractions(updatedTaskSettings);
verifyNoInteractions(model.getConfigurations().getChunkingSettings());
verifyNoInteractions(secretSettings);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the verification that we call updateServiceSettings() and updatedTaskSettings() (methods that affect the state of the objects they're called on), I don't think we need all this verification.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Adding verification for all of the calls was an overkill. I left only updateServiceSettings, updatedTaskSettings and newSecretSettings ones + the ones that validate that no calls were registered when they are not expected.

Comment on lines +536 to +541
assertThat(resultModelConfigurations.getInferenceEntityId(), sameInstance(model.getInferenceEntityId()));
assertThat(resultModelConfigurations.getTaskType(), sameInstance(model.getTaskType()));
assertThat(resultModelConfigurations.getService(), sameInstance(SERVICE_NAME_VALUE));
assertThat(resultModelConfigurations.getServiceSettings(), sameInstance(updatedServiceSettings));
assertThat(resultModelConfigurations.getTaskSettings(), sameInstance(updatedTaskSettings));
assertThat(resultModelConfigurations.getChunkingSettings(), sameInstance(model.getConfigurations().getChunkingSettings()));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sameInstance() checks should be used when we want to confirm that some object is the same in-memory object as some other object. It's a stronger check than simply checking equality, because two objects can be equal but not the same instance. However, for things like Strings, enum values and some primitives, this check is not really appropriate, since two Strings with the same value will end up being converted to the same in-memory object in Java, so if the Strings are equal, they will necessarily also be the same instance.

For example, this test passes:

String aString = "value";
String anotherString = "value";
assertThat(aString, sameInstance(anotherString));

whereas this one fails, even though .equals() would return true when comparing the two lists:

var aList = List.of("value");
var anotherList = List.of("value");
assertThat(aList, sameInstance(anotherList));

Copy link
Copy Markdown
Contributor Author

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct. That is the reason sameInstance check is used here for comparing whether or not checked variables are pointing towards the same object in the memory and not just equal to each other. It IS intentional.

if the Strings are equal, they will necessarily also be the same instance

I would disagree. For the strings there are cases when 2 strings while being equal to each other are not pointing towards the same object. Per example new String("value") would be equal but not the same as "value". Or in case of runtime concatenation:

String base = "va";
String s1 = "value";
String s2 = base + "lue";

assertThat(s1, sameInstance(s2)); // fails

Only usage of .intern() would make those variables point to the same object.

In these checks it is used to stay consistent with other checks because we DO expect same object to be used.
Do you want me to change it to is()? That wouldn't affect the result but would relax the checks. I wouldn't want to do that. The risk of meeting these cases when equals != same is minimal but not 0%. So I would keep it as is. Because in the logic - same String object is used. It just keeps things more understandable.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the strings there are cases when 2 strings while being equal to each other are not pointing towards the same object.

Very true! I'd forgotten about those cases, my statement was definitely too broad. I'm fine with leaving these assertions as they are, but I think that while sameInstance() is a stronger check, it's not always necessary to use it.

My philosophy with test assertions is that they should assert the expected behaviour rather than the implementation details, so while it's true that the String values are the same instance here, we don't actually care if they are or not, from a behaviour point of view. It's like using the InOrder interface from Mockito when doing verification; it's a stronger check to assert that methods are called in a particular order, but we only use that stronger check when it's really relevant to the behaviour of the method we're testing. If for some reason the implementation changed so that the String returned from resultModelConfigurations.getInferenceEntityId() was not the same instance as returned from model.getInferenceEntityId(), would we actually care? It would be a small inefficiency in the code, but probably not worth failing a test over.

All that being said, I'm fine with this as it is, it's just a personal preference in terms of testing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfectly understandable. In this particular case usage of the method that was used for other fields seemed OK, so let's keep it that way. Overall - very useful discussion. Thank you.

var model = createMockedModel(serviceSettings, taskSettings, null);
var modelSecrets = action.combineExistingSecretsWithNewSecrets(model, newSecretsMap);

assertThat(modelSecrets.getSecretSettings(), nullValue());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to this comment, I don't think this is the behaviour we want in this scenario. If there are no existing secret settings and a user wants to add some (and, somehow, both those scenarios are valid for the service they're using) then I think we should either use the new secret settings instead of ignoring them, or return an error to let the user know that the update they tried to perform failed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case described fits one of the issues discussed earlier in slack thread. I know that with 10k+ changed lines it might sound strange but I would want to a minimum the actual changes to the behavior. Right now the actual behavior changed in whether or not validation call to the provider is done and chunking settings are not lost anymore.

With understanding of your comment I would love to address it in a separate PR (for #140888 since it anyway will require changes to the logic of secret settings update) along with this test. Is that ok with you?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, addressing this in another PR is absolutely fine

@Jan-Kazlouski-elastic
Copy link
Copy Markdown
Contributor Author

Hello @DonalEvans! Thank you for your comments. All of them are addressed. Would you please give this PR another look?

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic enabled auto-merge (squash) February 2, 2026 18:00
@Jan-Kazlouski-elastic Jan-Kazlouski-elastic merged commit 51b0181 into elastic:main Feb 2, 2026
37 checks passed
lukewhiting pushed a commit to lukewhiting/elasticsearch that referenced this pull request Feb 3, 2026
…40003)

* Refactor model constructors to public visibility for accessibility

* Add buildModelFromConfigAndSecrets method to inference services

* Enhance model update process with validation and configuration handling

* Implement buildModelFromConfigAndSecrets method for enhanced model creation

* [CI] Auto commit changes from spotless

* Add clarification to model creation method documentation

* Remove testing usage comments from model constructors

* Refactor model instantiation to include null for optional parameters

* Add changelog

* Refactor task type handling in DeepSeekChatCompletionModel instantiation

* Refactor secret settings to use SecretSettings in model instantiations

* Refactor model instantiation to use SecretSettings instead of DefaultSecretSettings

* Refactor model instantiation to use SecretSettings instead of DefaultSecretSettings

* Refactor model constructors to accept ModelConfigurations and ModelSecrets directly

* Refactor Google Vertex AI model constructors to accept ModelConfigurations and ModelSecrets directly

* Update model validation timeout to 30 seconds in TransportUpdateInferenceModelAction

* Fix changelog

* Fix naming in TransportUpdateInferenceModelAction

* Refactor AzureAiStudioService to use switch expressions for model instantiation

* Add model creators for AlibabaCloudSearchService to streamline model instantiation

* Refactor model constructors to use ModelConfigurations for improved instantiation

* Refactor AzureAiStudio models to use dedicated model creators for improved instantiation

* Refactor AlibabaCloudSearchModelCreator interface for improved organization

* Refactor Ai21Service and AnthropicService to use dedicated model creators for improved instantiation

* Refactor AmazonBedrock models to use dedicated model creators for improved instantiation

* Refactor Ai21Service to use a map of model creators for improved task handling

* Refactor CohereService and ContextualAiService to use dedicated model creators for improved instantiation

* Refactor DeepSeekService to improve task type validation in model instantiation

* Refactor ElasticInferenceService to use model creators for improved task handling

* Refactor ElasticInferenceService and GoogleAiStudioService to utilize model creators for improved task handling and validation

* Refactor GoogleVertexAiService to implement model creators for enhanced task handling and validation

* Refactor GroqService and GroqChatCompletionModel to implement model creators for improved task handling and validation

* Refactor HuggingFaceElserModel and HuggingFaceElserService to improve task handling and validation

* Refactor HuggingFaceService to implement model creators for improved task handling and validation

* Refactor Ai21 model handling to implement model creators for improved task management and validation

* Refactor Ai21 model handling to update package structure and improve organization

* Refactor Ai21 model handling to implement a generic ModelFactory and improve model creation logic

* Add Alibaba Cloud Search model creators for completion, embeddings, rerank, and sparse models

* Add Amazon Bedrock model creators for chat completion and embeddings

* Add Anthropic model factory and chat completion model creator

* Add Azure AI Studio model creators for chat completion, embeddings, and rerank

* Refactor Azure OpenAI model handling to implement a ModelFactory and separate model creators for completion and embeddings

* Add Cohere model creators and factory for completion, embeddings, and rerank

* Add ContextualAiModelFactory and ContextualAiRerankModelCreator for reranking tasks

* Add CustomModelCreator and CustomModelFactory for custom model instantiation

* Add DeepSeek model creator and factory for chat completion

* Refactor ElasticInferenceService to use ModelFactory for model creation and add specific model creators for completion, dense text embeddings, and sparse embeddings

* Add Elastic model creators and factory

* Add Google AI Studio model creators and factory for completion and embeddings

* Add Google Vertex AI model creators and factory for chat completion, embeddings, and reranking

* Add Groq model creators and factory for chat completion

* Add Hugging Face model creators and factory for chat completion, embeddings, and reranking

* Add IBM Watsonx model creators and factory for chat completion, embeddings, and reranking

* Add Jina AI model creators and factory for embeddings and reranking

* Add Llama model creators and factory for chat completion and embeddings

* Add Mistral model creators and factory for chat completion and embeddings

* Add Nvidia model creators and refactor model construction for completion, embeddings, and reranking

* Add OpenAI model creators and refactor model construction for chat completion and embeddings

* Add OpenShift AI model creators and refactor model construction for chat completion, embeddings, and reranking

* Refactor SageMaker model construction to use fromStorage method with ModelConfigurations and ModelSecrets

* Refactor Voyage AI model constructors to use ModelConfigurations and ModelSecrets

* Refactor JinaAIEmbeddingsModel to use inferenceId and update model creators for consistency

* Remove misleading null check

* Add issue number to changelog

* Add model configuration and secrets handling to inference services tests

* Enhance task settings handling in inference service tests

* Refactor AlibabaCloudSearchServiceTests to use constants for service settings and enhance model validation tests

* Add method to retrieve ModelCreator from map with exception handling for invalid task types

* Add model creators for Ai21, AlibabaCloudSearch, and AmazonBedrock services to enhance model retrieval and configuration handling

* Refactor inference services to utilize ModelCreator maps for model instantiation and enhance validation handling

* Add model creators for Elasticsearch and implement validation for model IDs in inference service

* Add model creators for Google AI Studio and refactor model retrieval logic

* Add model creators for Jina, Llama, Mistral, Nvidia, OpenAi, OpenShift, and Voyage AI services to enhance model retrieval and validation handling

* Refactor AmazonBedrockServiceTests to use constants for configuration values and enhance model validation

* Refactor AnthropicServiceTests to use constants for model configurations and enhance validation handling

* Refactor service tests to use 'var' for local variable declarations and enhance model validation handling

* Enhance AzureOpenAiServiceTests with constant values for configuration and improve validation handling

* Enhance CohereServiceTests with model building tests and validation handling

* Enhance ContextualAiServiceTests with model building tests and validation handling

* Refactor ContextualAiServiceTests to use CoreMatchers for improved readability

* Enhance CustomServiceTests with model building tests and validation handling

* Enhance DeepSeekServiceTests with model building tests and validation handling

* Refactor model creation tests to use helper functions for improved readability

* Update AlibabaCloudSearchServiceTests to use completion service settings and task settings

* Refactor inference service tests to improve task type handling and enhance readability

* Enhance ElasticInferenceServiceTests with model building tests and validation handling

* Add TaskType parameter to AmazonBedrock model creation methods for completion requests

* Refactor inference service tests to unify persisted config handling

* Add test for building model from config and secrets for chat completion

* Refactor ElasticInferenceService model creation to include TaskType parameter

* Enhance ElasticRerankerServiceSettingsTests and ElasticsearchInternalServiceTests to support model ID and validate model building from configurations and secrets

* Add tests for building models from configurations and secrets in GoogleAiStudioServiceTests

* Add tests for building models from configurations and secrets in GoogleVertexAiServiceTests

* Add tests for building models from configurations and secrets in GroqServiceTests

* Add tests for building models from configurations and secrets in HuggingFaceElserServiceTests and HuggingFaceServiceTests

* Add tests for building models from configurations and secrets in IbmWatsonxChatCompletionActionTests, IbmWatsonxChatCompletionModelTests, and IbmWatsonxServiceTests

* Add tests for building models from configurations and secrets in JinaAIServiceTests

* Add tests for building models from configurations and secrets in MistralServiceTests

* Add tests for building models from configurations and secrets in SageMakerServiceTests

* Add tests for building models from configurations and secrets in VoyageAIServiceTests

* Change method visibility to protected for model validation and configuration methods in TransportUpdateInferenceModelAction

* Add tests for updating inference model configurations and secrets in TransportUpdateInferenceModelActionTests

* Remove unused imports and parser method in TransportUpdateInferenceModelAction

* Add tests for master operation error handling in TransportUpdateInferenceModelActionTests

* Add validation failure handling for inference model updates

* Simplify validation failure check in TestRerankingServiceExtension

* Refactor test task settings to remove random boolean for consistency

* Refactor exception handling in TransportUpdateInferenceModelActionTests for clarity

* Update timeout handling in TransportUpdateInferenceModelActionTests for consistency

* Enhance validation checks in TransportUpdateInferenceModelActionTests to ensure registry model update operations are correctly verified

* Refactor exception handling in TransportUpdateInferenceModelActionTests for clarity and consistency

* Refactor TransportUpdateInferenceModelActionTests to simplify model mocking and remove redundant verifications

* Enhance validation checks in InferenceCrudIT to ensure original model state is preserved after failed updates

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Feb 12, 2026
…ts on update

This addresses a bug introduced in elastic#140003 which causes service settings
for elasticsearch internal inference endpoints to lose their subclass information
and remain only as the `ElasticsearchInternalServiceSettings` parent class.
dimitris-athanasiou added a commit that referenced this pull request Feb 12, 2026
…ts on update (#142380)

This addresses a bug introduced in #140003 which causes service settings
for elasticsearch internal inference endpoints to lose their subclass information
and remain only as the `ElasticsearchInternalServiceSettings` parent class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug external-contributor Pull request authored by a developer outside the Elasticsearch team :SearchOrg/Inference Label for the Search Inference team Team:Search - Inference v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ML] Inference update API does not validate the updated settings

6 participants