[Enhancement] Error handling for unsupported characters in java regex library #4467

@RyanL1997

Description

Extraction commands such as rex and parse rely on the Java regex library. When a user specifies a named capture group whose name contains unsupported characters such as _, - or ., the request currently returns a 400 with a very unclear error message. For example:

# rex command with illegal character of -
curl -X POST "localhost:9200/_plugins/_ppl" \
    -H "Content-Type: application/json" \
    -d '{
      "query": "source=accounts | rex field=email \".+@(?<domain-name>.+)\" | fields email, domain-name"
    }'
{
  "error": {
    "reason": "Invalid Query",
    "details": "Rex pattern must contain at least one named capture group",
    "type": "IllegalArgumentException"
  },
  "status": 400
}

The message above only says that no named capture group was found. It does not explain that the group name itself is invalid, or why.
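
To see where the rejection originates, the pattern from the request can be compiled directly with java.util.regex. The following is a minimal sketch, not the plugin's code: the class GroupNameProbe and the method describeFailure are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of the plugin.
final class GroupNameProbe {

    // Returns null when the pattern compiles, otherwise the description
    // and error index reported by the Java regex library.
    static String describeFailure(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " (index " + e.getIndex() + ")";
        }
    }

    public static void main(String[] args) {
        // Group name with a hyphen, as in the curl example: rejected at compile time.
        System.out.println(describeFailure(".+@(?<domain-name>.+)"));
        // Same pattern with a letters-only group name: compiles, so this prints null.
        System.out.println(describeFailure(".+@(?<domain>.+)"));
    }
}
```

The description and index from PatternSyntaxException are the kind of detail the current response does not surface to the user.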

Expected Behavior

Add validation logic to make sure each named capture group follows the naming rules supported by the Java regex library:

  • Valid Characters:
    • Must start with: Letter (a-z, A-Z)
    • Can contain: Letters (a-z, A-Z) and digits (0-9)

Return a clear error message that calls out the invalid characters in the user-provided pattern:

  • Invalid Characters (will cause PatternSyntaxException):
    • Underscore (_)
    • Hyphen/Dash (-), as in the curl example above
    • Period/Dot (.)
    • Space ( )
    • Special characters (@, #, $, %, ^, &, *, etc.)
    • Unicode characters beyond ASCII
    • A digit (0-9) as the first character
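
The naming rules above could be checked before the pattern reaches Pattern.compile. The sketch below is one possible shape for such a check, under stated assumptions: GroupNameValidator, findInvalidGroupNames, and validate are hypothetical names, the error wording is only a suggestion, and the scan is deliberately simple (it does not account for escaped parentheses or character classes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical validator, not the plugin's existing regex util.
final class GroupNameValidator {

    // Matches "(?<name>" and captures the name; the lookahead skips the
    // lookbehind constructs "(?<=" and "(?<!".
    private static final Pattern CANDIDATE =
            Pattern.compile("\\(\\?<(?![=!])([^>]*)>");

    // Rule from this issue: starts with a letter, then letters or digits only.
    private static final Pattern VALID_NAME =
            Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    // Collects every candidate group name that breaks the naming rule.
    static List<String> findInvalidGroupNames(String regex) {
        List<String> invalid = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(regex);
        while (m.find()) {
            String name = m.group(1);
            if (!VALID_NAME.matcher(name).matches()) {
                invalid.add(name);
            }
        }
        return invalid;
    }

    // Throws with a message that names the offending groups and the rule.
    static void validate(String regex) {
        List<String> invalid = findInvalidGroupNames(regex);
        if (!invalid.isEmpty()) {
            throw new IllegalArgumentException(
                    "Invalid capture group name(s) " + invalid
                    + ": names must start with a letter and contain only"
                    + " letters and digits");
        }
    }

    public static void main(String[] args) {
        // Reports the hyphenated name from the curl example.
        System.out.println(findInvalidGroupNames(".+@(?<domain-name>.+)"));
    }
}
```

Running the check ahead of compilation lets the error name the offending group directly instead of falling back to a generic "no named capture group" message.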

Exit Criteria

  • Leverage the existing regex util so that all extraction-related commands are covered by the fix / enhancement
  • Add tests for parse / rex that validate the new behavior
  • Update the command documentation if needed

Reference

Metadata

Assignees

Labels

  • PPL: Piped processing language
  • calcite: calcite migration related
  • enhancement: New feature or request
  • error-experience: Issues related to how we handle failure cases in the plugin.

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests