Skip to content

ESQL: Push CIDR_MATCH to Lucene if possible#105061

Merged
alex-spies merged 6 commits intoelastic:mainfrom
alex-spies:push-down-cidr-match
Feb 6, 2024
Merged

ESQL: Push CIDR_MATCH to Lucene if possible#105061
alex-spies merged 6 commits intoelastic:mainfrom
alex-spies:push-down-cidr-match

Conversation

@alex-spies
Copy link
Copy Markdown
Contributor

@alex-spies alex-spies commented Feb 2, 2024

Fix #105042

@alex-spies alex-spies changed the title Push CIDR_MATCH to Lucene if possible ESQL: Push CIDR_MATCH to Lucene if possible Feb 2, 2024
@alex-spies alex-spies force-pushed the push-down-cidr-match branch from 9321ce7 to 044fe41 Compare February 2, 2024 15:27
@alex-spies alex-spies added the :Analytics/ES|QL AKA ESQL label Feb 2, 2024
@alex-spies alex-spies marked this pull request as ready for review February 2, 2024 15:31
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 2, 2024
@alex-spies alex-spies added the >bug label Feb 2, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @alex-spies, I've created a changelog YAML for you.

Copy link
Copy Markdown
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there.
A follow-up is to optimize different CIDR_Match functions on the same field into one - so
WHERE CIDR_MATCH(field, "123.0.0.0/24") OR CIDR_MATCH(field, "123.1.2.4.5/24") OR CIDR_MATCH(another_field, "172.0.0.1/16") AND CIDR_MATCH(extra_field, "5.4.2.1/8")
gets optimized into
WHERE CIDR_MATCH(field, "123.0.0.0/24", "123.1.2.4.5/24") OR CIDR_MATCH(another_field, "172.0.0.1/16") AND CIDR_MATCH(extra_field, "5.4.2.1/8")

* "cidr_match(ip, \"127.0.0.0/24\")@1:19"}}][_doc{f}#21], limit[500], sort[] estimatedRowSize[389]
*/
public void testCidrMatchPushdownFilter() {
var allTypeMappingAnalyzer = makeAnalyzer("mapping-all-types.json", new EnrichResolution());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the default analyzer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default mapping is missing a field of type ip. I'll use mapping-ip.json, though, to be more specific.

int cidrBlockCount = randomIntBetween(1, 10);
ArrayList<String> cidrBlocks = new ArrayList<>();
for (int i = 0; i < cidrBlockCount; i++) {
cidrBlocks.add("127.0." + i + ".0/24");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See whether there's another randomization function to have a larger spectrum of IPs including IPv6.

Copy link
Copy Markdown
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@astefan astefan self-requested a review February 5, 2024 14:15
@alex-spies
Copy link
Copy Markdown
Contributor Author

Thanks for reviews, Bogdan and Costin!

@costin, I looked into the optimization where we combine disjunctions of CIDR_MATCHes. I think this deserves its own issue as this might require more than just a logical optimization to actually take effect; I'll open that one.

Copy link
Copy Markdown
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@costin
Copy link
Copy Markdown
Member

costin commented Feb 6, 2024

This is a bug so make sure to backport it in 8.12 - use the automatic way through auto-backport-and-merge label + v8.12.2.

@alex-spies alex-spies merged commit b5f4c5e into elastic:main Feb 6, 2024
@alex-spies alex-spies deleted the push-down-cidr-match branch February 6, 2024 09:23
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💔 Backport failed

Status Branch Result
8.12 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 105061

@alex-spies
Copy link
Copy Markdown
Contributor Author

💚 All backports created successfully

Status Branch Result
8.12

Questions ?

Please refer to the Backport tool documentation

alex-spies added a commit to alex-spies/elasticsearch that referenced this pull request Feb 6, 2024
(cherry picked from commit b5f4c5e)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/EsqlTranslatorHandler.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
costin pushed a commit that referenced this pull request Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL backport pending >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.12.2 v8.13.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ESQL: pushdown CIDR_Match

5 participants