[Enhancement] Error handling for unsupported characters in java regex library #4467
Closed
Labels: PPL, calcite, enhancement, error-experience
Description
Currently, extraction commands such as rex and parse rely on the Java regex library. When a user specifies a named capture group containing unsupported characters such as _, -, or ., the request returns a 400 response with a very unclear error message. For example:
# rex command with illegal character of -
curl -X POST "localhost:9200/_plugins/_ppl" \
-H "Content-Type: application/json" \
-d '{
"query": "source=accounts | rex field=email \".+@(?<domain-name>.+)\" | fields email, domain-name"
}'
{
"error": {
"reason": "Invalid Query",
"details": "Rex pattern must contain at least one named capture group",
"type": "IllegalArgumentException"
},
"status": 400
}
The response says only that the pattern must contain a named capture group; it does not explain why the group name was rejected.
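For context, here is a minimal sketch of how java.util.regex itself reacts to the pattern from the request above. It is not the plugin's code; the class name GroupNameProbe and the helper compiles are hypothetical, and only Pattern.compile and PatternSyntaxException come from the JDK.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical probe class, for illustration only (not part of the plugin).
final class GroupNameProbe {

    // Returns true if the JDK accepts the regex, false if it throws PatternSyntaxException.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Letters-only group name: accepted.
        System.out.println(compiles(".+@(?<domain>.+)"));      // prints "true"
        // Hyphen in the group name, as in the curl example: rejected.
        System.out.println(compiles(".+@(?<domain-name>.+)")); // prints "false"
    }
}
```

The JDK rejects the hyphenated name outright, so a message that points at the offending group name would be more helpful than one saying no named capture group was found.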
Expected Behavior
Add validation logic to ensure that named capture groups follow the naming rules supported by Java regex:
- Valid Characters:
- Must start with: Letter (a-z, A-Z)
- Can contain: Letters (a-z, A-Z) and digits (0-9)
Return a clear error message that calls out the following invalid characters in the user-provided pattern:
- Invalid Characters (will cause PatternSyntaxException):
- Underscore (_)
- Hyphen/Dash (-), as used in the curl example above
- Period/Dot (.)
- Space ( )
- Special characters (@, #, $, %, ^, &, *, etc.)
- Unicode characters beyond ASCII
- A digit (0-9) as the first character
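The rules above could be checked before the pattern reaches the regex engine. The following is a minimal sketch under stated assumptions, not the plugin's implementation: the class NamedGroupValidator, its methods, and the error wording are all hypothetical. It scans leniently for (?<...> constructs, skips lookbehinds, and ignores edge cases such as escaped parentheses or character classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical validator, for illustration only (names are assumptions).
final class NamedGroupValidator {

    // Lenient scan: matches "(?<...>" but skips lookbehinds "(?<=" and "(?<!".
    private static final Pattern CANDIDATE =
        Pattern.compile("\\(\\?<(?![=!])([^>]*)>");

    // Rule from the list above: starts with a letter, then letters or digits only.
    private static final Pattern VALID_NAME =
        Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // Collects every candidate group name that breaks the naming rule.
    static List<String> invalidGroupNames(String regex) {
        List<String> invalid = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(regex);
        while (m.find()) {
            String name = m.group(1);
            if (!VALID_NAME.matcher(name).matches()) {
                invalid.add(name);
            }
        }
        return invalid;
    }

    // Throws with a message that names the offending group(s) and states the rule.
    static void validate(String regex) {
        List<String> invalid = invalidGroupNames(regex);
        if (!invalid.isEmpty()) {
            throw new IllegalArgumentException(
                "Invalid capture group name(s) " + invalid
                    + ": names must start with a letter and contain only letters and digits");
        }
    }
}
```

Calling NamedGroupValidator.validate with the pattern from the curl example would throw an IllegalArgumentException whose message includes domain-name, which gives the user something concrete to fix.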
Exit Criteria
- Leverage the existing regex util to make sure all the extraction related commands are covered by the fix / enhancement
- Add proper testing for parse / rex to validate the new behavior
- Update the command documentation if needed
Reference
- Official Java doc: https://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
- Existing issue for supporting _/- in the parse command: [FEATURE] Support _/- as parsed field name #3944