Skip to content

Regex used to parse Grok patterns prints warning with latest Joni #47861

@droberts195

Description

@droberts195

Following the Joni upgrade in #47374 the following warning is printed on Elasticsearch startup:

regular expression has redundant nested repeat operator + /%\{(?<name>(?<pattern>[A-z0-9]+)(?::(?<subname>[[:alnum:]@\[\]_:.-]+))?)(?:=(?<definition>(?:(?:[^{}]+|\.+)+)+))?\}/

I guess it means there is a non-fatal inefficiency in this regex:

private static final String GROK_PATTERN =
"%\\{" +
"(?<name>" +
"(?<pattern>[A-z0-9]+)" +
"(?::(?<subname>[[:alnum:]@\\[\\]_:.-]+))?" +
")" +
"(?:=(?<definition>" +
"(?:" +
"(?:[^{}]+|\\.+)+" +
")+" +
")" +
")?" + "\\}";
private static final Regex GROK_PATTERN_REGEX = new Regex(GROK_PATTERN.getBytes(StandardCharsets.UTF_8), 0,
GROK_PATTERN.getBytes(StandardCharsets.UTF_8).length, Option.NONE, UTF8Encoding.INSTANCE, Syntax.DEFAULT);

That regular expression has not changed for a long time, so fixing it is probably not necessary for correctness, but it will avoid questions on forums/issues/support cases if it could be fixed before release of 7.5.

Metadata

Metadata

Assignees

Labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions