[ML] Get categories endpoint to use ECS Grok patterns#89386
edsavage merged 2 commits into elastic:main
Change the Grok pattern creator for _ml/anomaly_detectors/<job_id>/results/categories to always use ECS Grok patterns.
Relates elastic#77065
Pinging @elastic/ml-core (Team:ML)
Tested with two sets of example messages, in each case comparing the output with legacy Grok patterns against the output with ECS Grok patterns. Note that with ECS Grok patterns the LOGLEVEL capture field has been renamed.
// For ECS compliant Grok patterns TOMCAT_DATESTAMP is defined as:
// TOMCAT_DATESTAMP (?:%{CATALINA8_DATESTAMP})|(?:%{CATALINA7_DATESTAMP})|(?:%{TOMCATLEGACY_DATESTAMP})
// and since the timestamps in the example messages are in CATALINA7_DATESTAMP format, TOMCAT_DATESTAMP, being at the
// front of our ORDERED_CANDIDATE_GROK_PATTERNS list, matches.
I think a better way to fix this is to change ORDERED_CANDIDATE_GROK_PATTERNS to have TOMCATLEGACY_DATESTAMP first instead of TOMCAT_DATESTAMP.
Patterns that try multiple options are slower to match, and it seems like this old Tomcat format is really ancient, since the person who updated the Grok patterns to ECS format couldn't find an example of it.
This discovery also has implications for the TimestampFormatFinder class. That should also be changed to swap out TOMCAT_DATESTAMP for TOMCATLEGACY_DATESTAMP when ECS compatibility is set to v1 - please open a separate PR for that.
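To make the ordering issue concrete, here is a minimal sketch of a first-match-wins candidate list. It is not the actual Elasticsearch code: the class name, the two-entry candidate lists, the sample messages, and the plain regular expressions standing in for the Grok definitions are all assumptions made for illustration (the real list has more candidates and the real Grok definitions are more permissive). It only shows why a TOMCAT_DATESTAMP defined as an alternation claims CATALINA7-style timestamps when it sits at the front, and why putting TOMCATLEGACY_DATESTAMP there instead leaves them for a later, more specific candidate.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class CandidateOrderSketch {

    // Simplified stand-ins for the Grok definitions (assumptions, NOT the real patterns).
    static final String CATALINA8 = "\\d{2}-[A-Z][a-z]{2}-\\d{4} \\d{2}:\\d{2}:\\d{2}";
    static final String CATALINA7 = "[A-Z][a-z]{2} \\d{1,2}, \\d{4} \\d{1,2}:\\d{2}:\\d{2} (?:AM|PM)";
    static final String TOMCATLEGACY = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2} [+-]\\d{4}";

    static final Map<String, Pattern> DEFINITIONS = Map.of(
        // Mirrors the shape of the ECS definition quoted above: an alternation of three formats.
        "TOMCAT_DATESTAMP",
        Pattern.compile("(?:" + CATALINA8 + ")|(?:" + CATALINA7 + ")|(?:" + TOMCATLEGACY + ")"),
        "CATALINA7_DATESTAMP", Pattern.compile(CATALINA7),
        "TOMCATLEGACY_DATESTAMP", Pattern.compile(TOMCATLEGACY)
    );

    // Hypothetical, heavily shortened candidate lists.
    static final List<String> ORIGINAL_ORDER = List.of("TOMCAT_DATESTAMP", "CATALINA7_DATESTAMP");
    static final List<String> SUGGESTED_ORDER = List.of("TOMCATLEGACY_DATESTAMP", "CATALINA7_DATESTAMP");

    // Hypothetical sample messages.
    static final String CATALINA7_SAMPLE = "Aug 29, 2022 10:15:30 AM org.apache.catalina.startup.Catalina start";
    static final String LEGACY_SAMPLE = "2022-08-29 10:15:30 +0000 | INFO | server started";

    /** Returns the name of the first candidate that matches anywhere in the message, or null. */
    static String firstMatch(List<String> orderedCandidates, String message) {
        for (String name : orderedCandidates) {
            if (DEFINITIONS.get(name).matcher(message).find()) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The alternation at the front claims the CATALINA7-style timestamp.
        System.out.println(firstMatch(ORIGINAL_ORDER, CATALINA7_SAMPLE));
        // With the legacy-only pattern at the front, the more specific candidate is reached.
        System.out.println(firstMatch(SUGGESTED_ORDER, CATALINA7_SAMPLE));
        // The legacy format is still recognised under the suggested order.
        System.out.println(firstMatch(SUGGESTED_ORDER, LEGACY_SAMPLE));
    }
}
```

In this sketch the suggested order also means the front-of-list candidate tries a single format rather than three, which is the performance point made above.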