TO_UPPER/TO_LOWER regex pushdown does not work in Elasticsearch 8.19 #131386

@julian-elastic

Elasticsearch Version

8.19

Installed Plugins

No response

Java Version

bundled

OS Version

Mac

Problem Description

TO_UPPER/TO_LOWER regex pushdown does not work in Elasticsearch 8.19. The optimization is disabled because the Lucene version bundled with Elasticsearch 8.19 does not support Unicode CASE_INSENSITIVE automata, so allowing the pushdown would return wrong results.
This task is to:

  1. Backport apache/lucene@7c050f9 to the version of Lucene that comes with Elasticsearch 8.19
  2. Replace ASCII_CASE_INSENSITIVE with CASE_INSENSITIVE when building automata
  3. Re-enable the ReplaceStringCasingWithInsensitiveRegexMatch rule
  4. Unmute the tests muted by "Disable ReplaceStringCasingWithInsensitiveRegexMatch rule in 8.19" (#131387)
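
To illustrate why ASCII-only case folding gives wrong results for non-ASCII text, here is a minimal sketch. It uses `java.util.regex` as a stand-in for Lucene's automaton-based matching, so it is an analogy rather than the actual Elasticsearch code path. The class and method names are hypothetical. The pattern and value are taken from the reproduction case in this issue.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Elasticsearch or Lucene.
public class CaseFoldingDemo {
    static final String REGEX = ".*Л.*";            // upper-case Cyrillic EL
    static final String REGION = "Свердлов району"; // contains lower-case "л"

    // Reference semantics of TO_UPPER(field) RLIKE regex:
    // upper-case the value, then match case-sensitively.
    static boolean upperThenMatch(String regex, String value) {
        return Pattern.compile(regex).matcher(value.toUpperCase(Locale.ROOT)).matches();
    }

    // ASCII-only folding: in java.util.regex, CASE_INSENSITIVE on its own
    // only folds US-ASCII letters, so "Л" does not match "л".
    static boolean asciiInsensitiveMatch(String regex, String value) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(value).matches();
    }

    // Unicode-aware folding: adding UNICODE_CASE lets "Л" match "л".
    static boolean unicodeInsensitiveMatch(String regex, String value) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        return Pattern.compile(regex, flags).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(upperThenMatch(REGEX, REGION));          // true
        System.out.println(asciiInsensitiveMatch(REGEX, REGION));   // false
        System.out.println(unicodeInsensitiveMatch(REGEX, REGION)); // true
    }
}
```

In this sketch the ASCII-only variant drops a row that the reference semantics would return, which is the kind of wrong result the disabled rule avoids. Lucene's `ASCII_CASE_INSENSITIVE` and `CASE_INSENSITIVE` flags are assumed to differ in a similar way, based on the description above.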

Steps to Reproduce

Add the following test to where-like.csv-spec and run it:

rlikeWithLowerTurnedInsensitiveUnicode#[skip:-8.18.99]
FROM airport_city_boundaries
| WHERE TO_UPPER(region) RLIKE ".*Л.*" and abbrev == "FRU"
| KEEP region
;

region:text
Свердлов району
;

Logs (if relevant)

No response
