Kibana version:
7.14.0-SNAPSHOT
Elasticsearch version:
7.14.0-SNAPSHOT
Server OS version:
macOS Catalina
Browser version:
Chrome 90.0.4430.212
Browser OS version:
macOS Catalina
Original install method (e.g. download page, yum, from source, etc.):
gradle run for Elasticsearch and yarn start for Kibana.
Describe the bug:
When creating a categorization job on some Elasticsearch logs using the new default categorization analyzer we are adding for 7.14, the token highlighting is completely wrong for some of the longer messages. It's not just that the wrong characters are highlighted: the highlighting process also seems to have corrupted the text itself. Highlighted tokens appear to be printed on top of non-highlighted text in the wrong places.
Also, some other messages don't seem to have any tokens highlighted at all.
Steps to reproduce:
- Upload a file called logs.csv using file upload (ping me to get a copy of this) and have the uploader create an index pattern for it
- Create a new categorization job using the categorization wizard
- On the first tab, ask to use the full time range of the data
- On the second tab let the validation take place, then look at the example messages with highlighted tokens, and hopefully you'll see something similar to the screenshot below
Expected behavior:
Token highlighting should make sense and be rendered correctly.
Screenshots (if relevant):
