Ingest Node's grok can't set the same field from two patterns #22117

@tsg

Elasticsearch version: 5.0.1

Plugins installed: ingest-node-geoip, ingest-node-ua

JVM version: 1.8

OS version: macOS sierra

Description of the problem including expected versus actual behavior:

See the following Ingest node simulate API call:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Pipeline for parsing MySQL slow logs.",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": [
            "%{DATA:mysql.error.timestamp} %{NUMBER:mysql.error.id} \\[%{DATA:mysql.error.level}\\] %{GREEDYDATA:mysql.error.message}",
            "%{LOCALDATETIME:mysql.error.timestamp} %{DATA:mysql.error.name} %{GREEDYDATA:mysql.error.message}"
          ],
          "ignore_missing": true,
          "pattern_definitions": {
            "LOCALDATETIME": "[0-9]+ %{TIME}"
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "161209 13:08:33 mysqld_safe Starting mysqld daemon with databases from /usr/local/var/mysql"
      }
    }
  ]
}

The pipeline has two grok patterns, and the provided doc should match the second one. It does match, but the mysql.error.message field is not created. If I rename the field to mysql.error.message1 in either of the two grok patterns, it is created as expected.
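
One possible explanation, stated here as an assumption and not confirmed against the Elasticsearch source, is that the patterns are joined into a single alternation and a capture name that appears in more than one alternative is read only from its first occurrence. The Python sketch below imitates that situation with simplified stand-in patterns. Python's re module rejects duplicate group names, so the groups get hypothetical suffixed names (message_0, message_1) and a hand-written lookup table; none of these names exist in Elasticsearch.

```python
import re

# Simplified stand-ins for the two grok patterns, joined as one alternation.
# The suffixed group names are hypothetical; they emulate one capture name
# ("message") that occurs once in each alternative.
COMBINED = re.compile(
    r"(?:\[(?P<level_0>\w+)\] (?P<message_0>.*))"
    r"|(?:(?P<name_1>\w+) (?P<message_1>.*))"
)

# Our own table: logical field name -> every group that may hold its value.
FIELD_GROUPS = {"message": ["message_0", "message_1"]}


def first_group_only(match, field):
    """Read the field from its first group only (the assumed faulty lookup)."""
    return match.group(FIELD_GROUPS[field][0])


def first_matched_group(match, field):
    """Read the field from the first group that actually took part in the match."""
    for group in FIELD_GROUPS[field]:
        value = match.group(group)
        if value is not None:
            return value
    return None


# This line matches only the second alternative.
match = COMBINED.match("mysqld_safe Starting mysqld daemon")
print(first_group_only(match, "message"))     # None
print(first_matched_group(match, "message"))  # Starting mysqld daemon
```

If something like this is what happens, it would be consistent with the field appearing again once the two captures no longer share a name.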

A workaround I found is to add a second pattern definition equivalent to GREEDYDATA (named GREEDYDATA1 here) and use it in one of the two patterns, like this:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Pipeline for parsing MySQL slow logs.",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": [
            "%{DATA:mysql.error.timestamp} %{NUMBER:mysql.error.id} \\[%{DATA:mysql.error.level}\\] %{GREEDYDATA:mysql.error.message}",
            "%{LOCALDATETIME:mysql.error.timestamp} %{DATA:mysql.error.name} %{GREEDYDATA1:mysql.error.message}"
          ],
          "ignore_missing": true,
          "pattern_definitions": {
            "LOCALDATETIME": "[0-9]+ %{TIME}",
            "GREEDYDATA1": ".*"
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "161209 13:08:33 mysqld_safe Starting mysqld daemon with databases from /usr/local/var/mysql"
      }
    }
  ]
}
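
Going only by the behavior reported above, the field is lost when the same pattern name and field name pair (GREEDYDATA:mysql.error.message) appears in both patterns, and it is kept once either half of the pair differs. The following Python sketch lists such repeated pairs so they can be renamed or aliased before the pipeline is used. The function name shared_captures and the simplified %{SYNTAX:field} regular expression are assumptions made for illustration, not part of any Elasticsearch API.

```python
import re
from collections import defaultdict

# Matches %{SYNTAX:field} and %{SYNTAX:field:type}; plain %{SYNTAX} is ignored.
# This is a simplified reading of grok syntax, good enough for the patterns here.
CAPTURE = re.compile(r"%\{(\w+):([\w.]+)(?::\w+)?\}")


def shared_captures(patterns):
    """Return (syntax, field) pairs that occur in more than one pattern."""
    seen = defaultdict(set)
    for index, pattern in enumerate(patterns):
        for syntax, field in CAPTURE.findall(pattern):
            seen[(syntax, field)].add(index)
    return sorted(pair for pair, where in seen.items() if len(where) > 1)


original = [
    r"%{DATA:mysql.error.timestamp} %{NUMBER:mysql.error.id} \[%{DATA:mysql.error.level}\] %{GREEDYDATA:mysql.error.message}",
    r"%{LOCALDATETIME:mysql.error.timestamp} %{DATA:mysql.error.name} %{GREEDYDATA:mysql.error.message}",
]
# Same list, with the second pattern using the GREEDYDATA1 alias.
workaround = [original[0], original[1].replace("%{GREEDYDATA:", "%{GREEDYDATA1:")]

print(shared_captures(original))    # [('GREEDYDATA', 'mysql.error.message')]
print(shared_captures(workaround))  # []
```

Note that mysql.error.timestamp is not reported: it is captured by DATA in one pattern and by LOCALDATETIME in the other, so the pair differs.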
