Guard reserved tags field against incorrect use#14822
Guard reserved tags field against incorrect use#14822kaisecheng merged 31 commits intoelastic:mainfrom
Conversation
…g. If the top-level `tags` is incorrect used, event will add tag `_tagsparsefailure` and rewrite the value to `_tags`.
…lue assignment to `tags` field `rename` - rename key/value pair assignment to `_tags` field, which is the default action `warn` - allow key/value pair assignment to `tags` field, which is the old logic
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
| // guard tags field from key/value map, only string or list is allowed | ||
| if (ILLEGAL_TAGS_ACTION == IllegalTagsAction.RENAME) { | ||
| final Object tags = Accessors.get(data, TAGS_FIELD); | ||
| if (tags instanceof ConvertedMap) { |
There was a problem hiding this comment.
instead of a block-list of one particular bad shape, can we have an allow-list of the supported shapes?
| final FieldReference renamedField = renameIllegalTags(field); | ||
| Accessors.set(data, renamedField, Valuefier.convert(value)); |
There was a problem hiding this comment.
I'm not sure this does what we want, since it renames the field independently of the shape of its value.
It may be worth intercepting TAGS_FIELD to be handled by a helper method:
public void setField(final FieldReference field, final Object value) {
if (field.equals(TAGS_FIELD)) {
setTagsField(value);
return;
}
switch (field.type()) {
// ...
}
} private void setTagsField(final Object value)
if (isLegalTagValue(value)) {
setField(TAGS_FIELD, value);
} else if (ILLEGAL_TAGS_ACTION == IllegalTagsAction.RENAME)
setField(FieldReference.from(TAGS_FAILURE_FIELD), value);
} else {
// log warning? avoid flooding though?
setField(TAGS_FIELD, value);
}
}
private boolean isLegalTagValue(final Object value) {
if (value instanceof String) { return true; }
if (value instanceof List) { return true; } // TODO: make sure it is a list of _strings_, which is difficult to do with generics and type erasure
return false;
}There was a problem hiding this comment.
Applied the suggestion. It now handles the illegal tags field first in Event.setField()
| if (ILLEGAL_TAGS_ACTION == IllegalTagsAction.RENAME && | ||
| field.getPath() != null && field.getPath().length > 0 && field.getPath()[0].equals(TAGS)) { |
There was a problem hiding this comment.
I read this as:
- IF we are supposed to rename illegal-shaped tags
- AND the field reference has a non-empty path
- AND the first-level is
tags
In my mind, we need to handle two cases when ILLEGAL_TAGS_ACTION == IllegalTagsAction.RENAME:
(path == null || path.length == 0) && key.equals(TAGS), with a value that is neither a string nor an array of strings (setting top-leveltagsto be an illegal shape)(path != null && path.length > 0 && path[0].equals(TAGS))regardless of value (setting field nested inside thetags, which cannot have nested fields in it)
Examples that should trigger a rename using the Event.setField(String, Object):
event.setField("[tags]", Map.of("poison", "true"))event.setField("[tags][poison]", "true")
Similarly, using the Ruby APIs Event#set:
event.set("[tags]", { "poison" => "true"})event.set("[tags][poison]", "true")
There was a problem hiding this comment.
Thanks for listing the cases. The PR is changed to handle other illegal types, not limited to map. And _tags field holds a list of illegal values.
| private final int type; | ||
|
|
||
| private FieldReference(final String[] path, final String key, final int type) { | ||
| public FieldReference(final String[] path, final String key, final int type) { |
There was a problem hiding this comment.
I am wary of making this public, since it would allow plugins built with the Java plugin API to route around our field reference caching.
There was a problem hiding this comment.
I revert it to private and build the FieldReference from string instead
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
change `_tags` to a list to store all invalid value build FieldReference from string instead of a public constructor
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
A few thoughts on the behaviour we have on On the Java side, setField is (too?) lenient, and allows us to break the event: From this I think we can create a similar experience for the tags field:
The main reason not to blindly crash on operations in 2. is that we may breaking some EXTREMELY edge cases out there in the world where a pipeline takes a fully created event and in the filter section sets a value to "tags" that isn't a string or list of strings. Considering this PR helps the "new event with broken 'tags'" use case already, I am comfortable throwing exceptions on setField. This should simplify the logic we need to support, and make the experience similar for the reserved fields |
|
For illustration, here is the tags behaviour in this PR when |
…istory - any set operation on event throws exception
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
jsvd
left a comment
There was a problem hiding this comment.
I believe we can now remove rebaseOnto and all its associated tests?
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
@jsvd I have changed to rename illegal value to |
jsvd
left a comment
There was a problem hiding this comment.
I'm good with the behaviour, I just think we should find another name to describe the default behaviour, and then ensure the docs reflect that.
Some of the docs still describe the behaviour as if "rename" happens both on creation and set, and it's actually "rename on creation, reject on set"
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
📃 DOCS PREVIEW ✨ https://logstash_14822.docs-preview.app.elstc.co/diff |
|
@logstashmachine backport 8.7 |
|
For future reference: This change was release-noted in https://www.elastic.co/guide/en/logstash/current/logstash-8-7-0.html . |
Fixed: #14711
Release notes
The reserved top-level
tagsfield has required a certain shape since v7.9. It supposes to be a string or an array of strings. Assigning a map totagsfield could crash unexpectedly. The pull request #14822 guardstagsfield against incorrect use. Assigning a map will result in having a_tagsparsefailureintagsfield and the illegal value will be written to_tagsfield.A new setting
--event_api.tags.illegalis added for backward compatibility. Since 8.7, the default value isrenameto move illegal value to_tagsfield.warnmaintains the same behavior of allowing illegal value assignment totagsfield.What does this PR do?
This PR prevents the reserved
tagsfield to be assigned with key/value map. Top-leveltagsshould only accept string of list. Whentagsget a map value, LogStash::Event will rewrite the value to_tagsand add_tagsparsefailuretotags.After the change,
tagsdoes not hold map value, while it uses to be allowed withruby { code => "event.set('[tags][k]' , 'v');" }. This means the unconsumed events in PQ and DLQ with a map value intagswill have a different event structure._tagshold illegal value fromtags.To make this change backward compatible, this PR adds a flag
--event_api.tags.illegalto allow fallback to old logic.There are two options.
warn- the old flow that allows illegal value assignment totagsfield.rename- the new flow that assigns illegal value to_tagsfield and adds_tagsparsefailuretotags. This is the default value in8.7Why is it important/What is the impact to the user?
Prior to this change, when a json payload contains
tagswith map value, the value is set successfully but if there is any error from plugins that needs to add a failure tag totags, such as_jsonparsefailureor_timestampparsefailure, the pipeline crashes as it cannot append a tag to a map value. This is bad because users lose visibility of what went wrong.Checklist
Author's Checklist
How to test this PR locally
Follow the reproducer of #14711
Extra test case
Expected output:
tags=> [_tagsparsefailure,_rubyexception]_tags=> {k=>v}Related issues
Use cases
Screenshots
Logs