[ML] Refactor ML mappings and templates into JSON resources#51765
Conversation
|
Pinging @elastic/ml-core (:ml) |
There was a problem hiding this comment.
@gwbrown Here are the changes I made to allow variable replacement on template loading. Could you take a look please when you have some time?
There was a problem hiding this comment.
I have not replaced this with returning an operator that does nothing yet because it is unclear whether IndexTemplateRegistry fully addresses upgrading issues. I'm waiting to hear on this, until then the PR is WIP.
There was a problem hiding this comment.
These tests are unfortunately the result of extreme mocking, something we tried in the past but doesn't seem to bring benefits. I have removed them and replaced them with integration tests in JobResultsProviderIT.
There was a problem hiding this comment.
I just noticed this one, I forgot to address it. I'll take care of it before removing the WIP label.
There was a problem hiding this comment.
In hindsight I'm not sure this mock client class was useful. I remove all unused methods.
There was a problem hiding this comment.
This template's name is formed by the inference prefix plus the latest version. This would mean that once we change version we'll create a new template. I think that's probably not a good idea. I left it as is for now but we can address in this PR. In fact, our template naming is a bit inconsistent but I'm not sure we can iron it easily. I'll make sure we discuss this.
AthenaEryma
left a comment
There was a problem hiding this comment.
Left a couple comments about escaping in the changes to TemplateUtils/IndexTemplateConfig, but I think once those are addressed the additional variable support will be valuable.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/template/IndexTemplateConfig.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/template/TemplateUtils.java
Outdated
Show resolved
Hide resolved
droberts195
left a comment
There was a problem hiding this comment.
I saw a few nits plus there's a question of what we should do when the plugin is disabled.
Let me know if the test failures are showing up any unforeseen edge cases that need talking through.
There was a problem hiding this comment.
For the templates we decided to leave this out in the future. Can it be omitted here too now we're explicitly setting it in the affected tests? (Plus the code above that looks for whether we're using the black hole autodetect process.)
There was a problem hiding this comment.
We're using the same state index for data frame analytics state too, so maybe drop the ANOMALY_DETECTION_ bit?
There was a problem hiding this comment.
I'd like to move all state-related logic in AnomalyDetectorsIndex into a MlStateIndex class. But I'd rather do that in a follow-up PR.
There was a problem hiding this comment.
This is a difference compared to what we did with the old getIndexTemplateMetaDataUpgrader() functionality. In that we used to return our templates even when ML was disabled on the node. Returning nothing when ML is disabled isn't necessarily wrong, but it's a difference that needs thinking about.
There may be some subtle effect in Cloud for example. I think at one time there was a proposal that when a Cloud cluster gets switched to different hardware and a full cluster snapshot from the old hardware is restored into brand new empty cluster running on different hardware that all the X-Pack plugins are disabled during that restore. I am not sure if this is still being proposed or not. But it's a scenario where the ML plugin would be disabled in a cluster that contained ML indices.
Let's consult more widely on this offline.
There was a problem hiding this comment.
It was wrong to return an empty list. A number of tests was failing because of that. As long as the plugin is loaded, the templates should be there as some nodes may become ML nodes.
There was a problem hiding this comment.
there is a lot of conversions and deep copies: utf8->utf16->utf8->utf16. It does not look to hard to implement replaceVariables and validate with strings instead of utf8 byte buffers.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/template/TemplateUtils.java
Outdated
Show resolved
Hide resolved
|
The changes in b765370 LGTM. Thank you! |
|
I believe the failing tests in |
ML mappings and index templates have so far been created programmatically. While this had its merits due to static typing, there is consensus it would be clear to maintain those in json files. In addition, we are going to adding ILM policies to these indices and the component for a plugin to register ILM policies is `IndexTemplateRegistry`. It expects the templates to be in resource json files. For the above reasons this commit refactors ML mappings and index templates into json resource files that are registered via `MlIndexTemplateRegistry`.
3b553d8 to
7ace93c
Compare
|
This has now been rebased to pick #51456. Let's see if it passes all the tests. |
|
Changes in 0aba76e also LGTM. Thanks again! |
| throw new ElasticsearchParseException("Template must not be null"); | ||
| } | ||
| if (Strings.isEmpty(source)) { | ||
| throw new ElasticsearchParseException("Template must not be empty"); |
There was a problem hiding this comment.
nit: I wonder about that: this exception type is used if a REST request is malformed. In this context this makes no sense and we consider the input immutable, right? IMHO this is more of a RTE kind of thing.
Anyway, this is out of scope for this PR, you just add another exception and follow the existing style.
| @@ -0,0 +1,36 @@ | |||
| { | |||
There was a problem hiding this comment.
can we put comments in such files? I think it would be good to leave some explanation to the reader of this. I would also add a placeholder for a changelog, in case we change the schema.
^ applies to all json files introduced
There was a problem hiding this comment.
There is no support for comments in JSON files as it stands. We could in theory support them by removing them comment lines when such files are loaded. As for a changelog, I think one of the benefits for those being files, is that we treat them similarly to code files. Thus, git serves as a change log. We can git-blame them and see the commit where each line was introduced/modified. Not sure we need more. But let's see how it goes and we add them if it appears to be needed?
droberts195
left a comment
There was a problem hiding this comment.
LGTM
Since the CI is green and this change is blocking two other pieces of work from starting I would be happy to merge this as-is and do any additional minor tweaks in followup PRs.
…stic#51765) ML mappings and index templates have so far been created programmatically. While this had its merits due to static typing, there is consensus it would be clear to maintain those in json files. In addition, we are going to adding ILM policies to these indices and the component for a plugin to register ILM policies is `IndexTemplateRegistry`. It expects the templates to be in resource json files. For the above reasons this commit refactors ML mappings and index templates into json resource files that are registered via `MlIndexTemplateRegistry`. Backport of elastic#51765
#52353) ML mappings and index templates have so far been created programmatically. While this had its merits due to static typing, there is consensus it would be clear to maintain those in json files. In addition, we are going to adding ILM policies to these indices and the component for a plugin to register ILM policies is `IndexTemplateRegistry`. It expects the templates to be in resource json files. For the above reasons this commit refactors ML mappings and index templates into json resource files that are registered via `MlIndexTemplateRegistry`. Backport of #51765
ML mappings and index templates have so far been created
programmatically. While this had its merits due to static typing,
there is consensus it would be clear to maintain those in json files.
In addition, we are going to adding ILM policies to these indices
and the component for a plugin to register ILM policies is
IndexTemplateRegistry. It expects the templates to be in resourcejson files.
For the above reasons this commit refactors ML mappings and index
templates into json resource files that are registered via
MlIndexTemplateRegistry.