Add support to define routing rules in data streams#535
Conversation
Increase following version to 2.9.0
🌐 Coverage report
|
|
@mrodm is this one ready to be reviewed? |
Not yet, pending some changes. I'm working to include the |
| allOf: | ||
| - if: | ||
| required: | ||
| - routing_rules | ||
| then: | ||
| required: | ||
| - dataset |
There was a problem hiding this comment.
In case routing_rules is defined, dataset becomes a required field.
|
/test |
| type: object | ||
| additionalProperties: false | ||
| properties: | ||
| dataset: |
There was a problem hiding this comment.
What would be more representative here ? dataset or package?
There was a problem hiding this comment.
The option in the processor is called dataset, right? Though it seems it could accept a package name.
I would use dataset, to be coherent with the processor, or name if we want to be generic enough, but maybe name is too generic (would it be the option passed to the processor or would it be a name for the route itself?).
There was a problem hiding this comment.
If I understand correctly the docs and the example in the issue #514, this field would define the source dataset to be used by the reroute processor to check for documents to be routed to another dataset. This latter dataset would be the ones defined in each element under rules key.
Being that, would dataset be a good name? another options that I can think of source, source_dataset? @jsoriano @juliaElastic
If so, I would add these description to these fields
--- spec/integration/data_stream/manifest.spec.yml
+++ spec/integration/data_stream/manifest.spec.yml
@@ -364,8 +364,10 @@ spec:
additionalProperties: false
properties:
dataset:
+ description: Source dataset to be used for this reroute processsor
type: string
rules:
+ description: List of routing rules
type: array
items:
$ref: "#/definitions/routing_rule"These definitions are now defined in its own file spec/integration/data_stream/routing_rules.spec.yml
There was a problem hiding this comment.
I like source_dataset and target_dataset, it is very easy to understand.
| dynamic_namespace: | ||
| description: When set to true, agents running this integration are granted data stream privileges for all namespaces of its type | ||
| type: boolean | ||
| routing_rules: # technical preview |
There was a problem hiding this comment.
Tried to install a package with this new routing_rules key in 8.7.1 and it is installed successfully. No error and no reference to those processors neither.
Should we add some validation rule for this ? Would it be needed to set here some validation rule regarding minimum kibana version? Currently, it would be like ignored by Fleet.
There was a problem hiding this comment.
It depends, if we consider this as an additional feature that will work only where supported, then I guess it is fine to leave it without the check.
But in theory we should add a validation to avoid confusion.
There was a problem hiding this comment.
Probably, it could be kept as it is without validation rule
spec/changelog.yml
Outdated
| - description: Add support to define routing rules in data streams (technical preview) | ||
| type: enhancement | ||
| link: https://github.com/elastic/package-spec/pull/535 |
There was a problem hiding this comment.
Added technical preview since it is documented also that reroute processors is in technical preview too. https://www.elastic.co/guide/en/elasticsearch/reference/current/reroute-processor.html
Is that ok ? @jsoriano @juliaElastic
spec/changelog.yml
Outdated
| - description: Add support to define routing rules in data streams (technical preview) | ||
| type: enhancement | ||
| link: https://github.com/elastic/package-spec/pull/535 |
| hidden: | ||
| description: Makes the data stream hidden | ||
| type: boolean | ||
| route: |
There was a problem hiding this comment.
So we are not defining the routes on separate files at the end then?
There was a problem hiding this comment.
As I wanted to remove this new key routing_rules in case spec version is < 2.9.0, I had to do in this way (adding it to the manifest).
Adding it as a new file, I would have to remove the entry also in the spec.yml but I think it is not supported json patches in those files.
There was a problem hiding this comment.
Ah yes, but well, we should give priority to how we want packages to look like. If this is the only reason to keep this in the manifest, lets add support for json patches in the folder spec files.
There was a problem hiding this comment.
I'll check how to take into account JSON patches in folder spec files
There was a problem hiding this comment.
Support to add JSON patches in folder spec files will be added in this PR
#540
There was a problem hiding this comment.
There is a possible drawback of moving this to its own file.
If routing_rules are defined in it own file, the condition to set as required the dataset field , I think it cannot be achieved. They would be in different files.
Referring to this block:
allOf:
- if:
required:
- routing_rules
then:
required:
- datasetThere was a problem hiding this comment.
Added a new validation rule in code for this requirement, since it cannot be achieved with if-then-else block in JSON schema.
| if: | ||
| type: string | ||
| examples: | ||
| - "ctx?.file?.path?.contains('/var/log/nginx/error')" | ||
| - "ctx?.container?.image?.name == 'nginx'" |
There was a problem hiding this comment.
I hope this if is not interpreted by json schema 🙂
There was a problem hiding this comment.
I've been doing some tests and it works as expected.
To be considered that if as an if-then-else block, I would say it has to be at the same level (indentation) as properties key.
| dynamic_namespace: | ||
| description: When set to true, agents running this integration are granted data stream privileges for all namespaces of its type | ||
| type: boolean | ||
| routing_rules: # technical preview |
There was a problem hiding this comment.
It depends, if we consider this as an additional feature that will work only where supported, then I guess it is fine to leave it without the check.
But in theory we should add a validation to avoid confusion.
| hidden: | ||
| description: Makes the data stream hidden | ||
| type: boolean | ||
| route: |
There was a problem hiding this comment.
Nit. Call this routing_rule?
| route: | |
| routing_rule: |
| type: object | ||
| additionalProperties: false | ||
| properties: | ||
| dataset: |
There was a problem hiding this comment.
The option in the processor is called dataset, right? Though it seems it could accept a package name.
I would use dataset, to be coherent with the processor, or name if we want to be generic enough, but maybe name is too generic (would it be the option passed to the processor or would it be a name for the route itself?).
| type: object | ||
| properties: | ||
| dataset: | ||
| type: string |
There was a problem hiding this comment.
If this corresponds to dataset in the reroute processor, this can be also a list: https://www.elastic.co/guide/en/elasticsearch/reference/current/reroute-processor.html
There was a problem hiding this comment.
Updated.
Reading that definition, I set both string and array of strings for both dataset and namespace keys.
| type: array | ||
| items: |
There was a problem hiding this comment.
Try this if you want to try to make this an object instead of an array:
| type: array | |
| items: | |
| type: object | |
| additionalProperties: |
There was a problem hiding this comment.
Applying this suggestion:
--- spec/integration/data_stream/manifest.spec.yml
+++ spec/integration/data_stream/manifest.spec.yml
@@ -358,20 +358,11 @@ spec:
type: boolean
routing_rules: # technical preview
description: Routing rules set.
- type: array
- items:
- type: object
- additionalProperties: false
- properties:
- dataset:
- type: string
- rules:
- type: array
- items:
- $ref: "#/definitions/routing_rule"
- required:
- - dataset
- - rules
+ type: object
+ additionalProperties:
+ type: array
+ items:
+ $ref: "#/definitions/routing_rule"
allOf:
- if:
required:works and it would be like in the example given in the issue.
But it cannot be applied, since some elements under routing_rules could contain dots, for instance k8s.router. As it is implemented now package-spec, every dot in these keys are expanded as objects (link)
I think we would need to keep this a list of elements.
Example of the error:
2023/06/14 12:57:27 Warning: package using an unreleased version of the spec (2.9.0-next)
validator_test.go:476:
Error Trace: /home/mariorodriguez/Coding/work/package-spec/code/go/pkg/validator/validator_test.go:476
Error: Received unexpected error:
found 1 validation error:
1. file "../../../../test/packages/good_v2/data_stream/routing_rules/manifest.yml" is invalid: field routing_rules.k8s: Invalid type. Expected: array, given: object
Test: TestValidateWarnings/good_v2
As routing rules are not defined in its own file, its definition has been updated to be an array directly. No need to have a "routing_rules" key. Updated description for source dataset.
| {fn: semantic.ValidateILMPolicyPresent, since: semver.MustParse("2.0.0"), types: []string{"integration"}}, | ||
| {fn: semantic.ValidateProfilingNonGA, types: []string{"integration"}}, | ||
| {fn: semantic.ValidateKibanaObjectIDs, types: []string{"integration"}}, | ||
| {fn: semantic.ValidateRoutingRulesAndDataset, types: []string{"integration"}, since: semver.MustParse("2.9.0")}, |
There was a problem hiding this comment.
Added here the since, to avoid checking for routing rules before 2.9.0.
It has been added a JSON Patch to remove it before 2.9.0.
jsoriano
left a comment
There was a problem hiding this comment.
LGTM, some nitpicking but can be ignored 🙂
| type: string | ||
| rules: | ||
| description: List of routing rules | ||
| type: array |
There was a problem hiding this comment.
Nit. Maybe add a comment here mentioning that this is not an object because using the source dataset as key would require to support keys with dots.
There was a problem hiding this comment.
Added, better to keep it as a comment
Co-authored-by: Jaime Soriano Pastor <jaime.soriano@elastic.co>
Adding minItems key results in an error: minItems must be of an integer, that could be related to the conversions performed in package-spec between YAML and JSON formats.
What does this PR do?
This PR adds support for routing rules for data streams.
If routing_rules are defined, then
datasetfield becomes mandatory to be set too.This new feature will be part of the following package-spec version 2.9.0 (updated changelog accordingly).
These routing rules definitions are going to be defined in a new file under each data stream folder. The file name would be
routing_rules.ymlExample of the contents/definitions of this new file (following example in #514):
Why is it important?
Integration packages need to expose their routing rules as part of their data stream manifest files.
Checklist
test/packagesthat prove my change is effective.good_v2bad_routing_rulesspec/changelog.yml.Related issues
How to test
elastic-packagewith the latest changeset from this branch. Example:datasetfield.