Skip to content

Conversation

@adriangb
Copy link
Contributor

@adriangb adriangb commented Nov 2, 2025

Background

This PR is part of an EPIC to push down hash table references from HashJoinExec into scans. The EPIC is tracked in #17171.

A "target state" is tracked in #18393.
There is a series of PRs to get us to this target state in smaller more reviewable changes that are still valuable on their own:

Changes in this PR

  • Enhance InListExpr to efficiently store homogeneous lists as arrays and avoid a conversion to Vec
    by adding an internal InListStorage enum with Array and Exprs variants
  • Re-use existing hashing and comparison utilities to support Struct arrays and other complex types
  • Add public function in_list_from_array(expr, list_array, negated) for creating InList from arrays

Although the diff looks large most of it is actually tests and docs. I think the actual code change is a negative LOC change, or at least negative complexity (eliminates a trait, a macro, matching on data types).

@github-actions github-actions bot added physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt) common Related to common crate proto Related to proto crate physical-plan Changes to the physical-plan crate labels Nov 2, 2025
Comment on lines 318 to 320
// TODO: serialize the inner ArrayRef directly to avoid materialization into literals
// by extending the protobuf definition to support both representations and adding a public
// accessor method to InListExpr to get the inner ArrayRef
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create a followup issue once we merge this

05)--------ProjectionExec: expr=[]
06)----------CoalesceBatchesExec: target_batch_size=8192
07)------------FilterExec: substr(md5(CAST(value@0 AS Utf8View)), 1, 32) IN ([7f4b18de3cfeb9b4ac78c381ee2ad278, a, b, c])
07)------------FilterExec: substr(md5(CAST(value@0 AS Utf8View)), 1, 32) IN (SET) ([7f4b18de3cfeb9b4ac78c381ee2ad278, a, b, c])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because we now support Utf8View for building the sets 😄

Comment on lines 565 to 572
let random_state = RandomState::with_seed(0);
let mut hashes_buf = vec![0u64; array.len()];
let Ok(hashes_buf) = create_hashes_from_arrays(
&[array.as_ref()],
&random_state,
&mut hashes_buf,
) else {
unreachable!("Failed to create hashes for InList array. This shouldn't happen because make_set should have succeeded earlier.");
};
hashes_buf.hash(state);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could pre-compute and store a hash: u64 which would be both more performant when Hash is called and avoid this error, but it would add more complexity and some overhead when building the InListExpr

alamb added a commit to alamb/datafusion that referenced this pull request Nov 7, 2025
## Background

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- (This PR): apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

## Changes in this PR

Change create_hashes and related functions to work with &dyn Array
references instead of requiring ArrayRef (Arc-wrapped arrays). This
avoids unnecessary Arc::clone() calls and enables calls that only have
an &dyn Array to use the hashing utilities.

- Add create_hashes_from_arrays(&[&dyn Array]) function
- Refactor hash_dictionary, hash_list_array, hash_fixed_list_array to
use references instead of cloning
- Extract hash_single_array() helper for common logic

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@github-actions github-actions bot removed common Related to common crate physical-plan Changes to the physical-plan crate labels Nov 7, 2025
/// supported. Returns None otherwise. See [`LiteralGuarantee::analyze`] to
/// create these structures from an predicate (boolean expression).
fn new<'a>(
fn new(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth discussing in this review how far we propagate the changes.

In particular, InListExpr will now have two operations modes:

  1. Was built with an ArrayRef or was able to build an ArrayRef from a homogeneously typed Vec<Arc<dyn PhysicalExpr>> which are all literals.
  2. Was built with a Vec<Arc<dyn PhysicalExpr>> which are not literals or homogeneously typed.

If we restrict LiteralGuarantee to only operate on the first cases, I think we could lift out a lot of computation: instead of transforming ArrayRef -> Vec<Arc<dyn PhysicalExpr>> -> Vec<ScalarValue> -> HashSet<ScalarValue> which then gets fed into bloom filters which are per-column and don't really support heterogenous ScalarValues we could re-use the already deduplicated ArraySet that InListExpr has internally or something. The ultimate thing to do, but that would require even more work and changes, would be to make PruningPredicate::contains accept an enum ArrayOrScalars { Array(ArrayRef), Scalars(Vec<ScalarValue>) } so that we can push down and iterate over the Arc'ed ArrayRef the whole way down. I think we could make this backwards compatible.

I think that change is worth it, but it requires a bit more coordination (with arrow-rs) and a bigger change.

The end result would be that:

  1. When you create an InListExpr operates in mode (1) we are able to push down into bloom filters with no data copies at all.
  2. When the InListExpr operates in mode (2) we'd bail on the pushdown early (e.g. list() -> Option<ArrayRef>) and avoid building the HashSet<ScalarValue>, etc. that won't be used.

Wdyt @alamb ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I've looked into this and it is entirely possible, I think we should do it.
Basically the status quo currently is that we always try to build an ArrayHashSet which is only possible if we can convert the Vec<ScalarValue> into an ArrayRef.

At that point the only reason to store the Vec<SclarValue> is to later pass it into PruningPredicate -> bloom filters and LiteralGuarantee. If we can refactor those two to also handle an ArrayRef we could probably avoid a lot of cloning and make things more efficient by using arrays. I don't even think we need to support Vec<ScalarValue> at all: the only reason to have that is if you could not build a homogeneously typed array, and if that is the case you probably can't do any sort of pushdown into a bloom filter. E.g. select variant_get(col, 'abc') in (1, 2.0, 'c') might make sense and work but I don't think we could ever push that down into a bloom filter. So InListExpr needs to operate on both but I don't think the pruning machinery does.

So anyway I think I may try to reduce this change to only be about using create_hashes and ignore any inefficiencies as a TODO for a followup issue. At the end of the day if we can make HashJoinExec faster even if that's with some inefficiencies I think that's okay and we can improve more later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll record a preview of some of the changes I had made to explore this (by no means ready) just for future reference: https://github.com/pydantic/datafusion/compare/refactor-in-list...pydantic:datafusion:use-array-in-pruning?expand=1

@github-actions github-actions bot removed the proto Related to proto crate label Nov 9, 2025
@adriangb adriangb changed the title Refactor InListExpr to store arrays and support structs Refactor InListExpr to support structs by re-using existing hashing infrastructure Nov 9, 2025
Comment on lines -69 to -73
pub trait Set: Send + Sync {
fn contains(&self, v: &dyn Array, negated: bool) -> Result<BooleanArray>;
fn has_nulls(&self) -> bool;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get rid of the Set trait. The only implementer was ArraySet

Comment on lines 186 to 190
array => Arc::new(ArraySet::new(array, make_hash_set(array))),
DataType::Boolean => {
let array = as_boolean_array(array)?;
Arc::new(ArraySet::new(array, make_hash_set(array)))
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get rid of this type matching logic

Comment on lines -243 to -252
trait IsEqual: HashValue {
fn is_equal(&self, other: &Self) -> bool;
}

impl<T: IsEqual + ?Sized> IsEqual for &T {
fn is_equal(&self, other: &Self) -> bool {
T::is_equal(self, other)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get rid of these custom equality / hash traits

@alamb
Copy link
Contributor

alamb commented Nov 9, 2025

🤖 ./gh_compare_branch_bench.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing refactor-in-list (f412ead) to 1d8bc9b diff
BENCH_NAME=in_list
BENCH_COMMAND=cargo bench --bench in_list
BENCH_FILTER=
BENCH_BRANCH_NAME=refactor-in-list
Results will be posted here when complete

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adriangb

I looked through the code and the basic idea makes a lot of sense to me 👍

I kicked off some benchmarks to see what impact, if any, this change has on performance. Assuming it is the same or better, I think it would be good to merge

I do suggest adding some slt level logic for struct IN lists if we don't already have some, but I don't think it is necessary

false => Some(negated),
}
})
let mut hashes_buf = vec![0u64; v.len()];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow on PR we could potentially look into reusing this hashes_buf -- aka rather than reallocating each invocations of contains instead make a field (probably needs to be a Mutex or something) that is a Vec

})
let mut hashes_buf = vec![0u64; v.len()];
create_hashes([v], &self.state, &mut hashes_buf)?;
let cmp = make_comparator(v, in_array, SortOptions::default())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comparator is some dynamic function -- the overhead of using the dynamic dispatch in the critical path may be substantial).

If it turns out to be too slow, we can potentially create specializations for comparisons (aka make a speicalized hash set for the different physical array types, and fall back to the dynamic comparator)

///
/// The `list` field will be empty when using this constructor, as the array is stored
/// directly in the static filter.
pub fn in_list_from_array(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be more discoverable if this was a method on InList rather than a free function

Something like

impl InLIst
  fn new_from_array(  expr: Arc<dyn PhysicalExpr>,
    array: ArrayRef,
    negated: bool,
   ) -> Result<Self> {
...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree, I was just following the existing patterns

}

#[test]
fn in_list_struct() -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also please add some .slt level tests for IN on a set?

@alamb
Copy link
Contributor

alamb commented Nov 9, 2025

🤖: Benchmark completed

Details

group                                       main                                   refactor-in-list
-----                                       ----                                   ----------------
in_list_f32 (1024, 0) IN (1, 0)             1.00      4.3±0.01µs        ? ?/sec    1.14      4.9±0.01µs        ? ?/sec
in_list_f32 (1024, 0) IN (10, 0)            1.00      4.3±0.01µs        ? ?/sec    1.14      4.9±0.01µs        ? ?/sec
in_list_f32 (1024, 0) IN (100, 0)           1.00      4.2±0.03µs        ? ?/sec    1.14      4.9±0.01µs        ? ?/sec
in_list_f32 (1024, 0) IN (3, 0)             1.00      4.2±0.01µs        ? ?/sec    1.15      4.9±0.01µs        ? ?/sec
in_list_f32 (1024, 0.2) IN (1, 0)           1.00      5.5±0.06µs        ? ?/sec    1.36      7.5±0.03µs        ? ?/sec
in_list_f32 (1024, 0.2) IN (10, 0)          1.00      5.5±0.06µs        ? ?/sec    1.40      7.7±0.07µs        ? ?/sec
in_list_f32 (1024, 0.2) IN (100, 0)         1.00      5.6±0.03µs        ? ?/sec    1.23      6.9±0.06µs        ? ?/sec
in_list_f32 (1024, 0.2) IN (3, 0)           1.00      5.5±0.06µs        ? ?/sec    1.38      7.6±0.03µs        ? ?/sec
in_list_i32 (1024, 0) IN (1, 0)             1.00      4.2±0.01µs        ? ?/sec    1.16      4.9±0.01µs        ? ?/sec
in_list_i32 (1024, 0) IN (10, 0)            1.00      4.2±0.01µs        ? ?/sec    1.16      4.9±0.01µs        ? ?/sec
in_list_i32 (1024, 0) IN (100, 0)           1.00      4.2±0.01µs        ? ?/sec    1.16      4.9±0.01µs        ? ?/sec
in_list_i32 (1024, 0) IN (3, 0)             1.00      4.2±0.01µs        ? ?/sec    1.16      4.9±0.01µs        ? ?/sec
in_list_i32 (1024, 0.2) IN (1, 0)           1.00      5.7±0.02µs        ? ?/sec    1.31      7.5±0.04µs        ? ?/sec
in_list_i32 (1024, 0.2) IN (10, 0)          1.00      5.6±0.01µs        ? ?/sec    1.29      7.2±0.04µs        ? ?/sec
in_list_i32 (1024, 0.2) IN (100, 0)         1.00      5.5±0.03µs        ? ?/sec    1.22      6.7±0.04µs        ? ?/sec
in_list_i32 (1024, 0.2) IN (3, 0)           1.00      5.7±0.04µs        ? ?/sec    1.31      7.5±0.03µs        ? ?/sec
in_list_utf8(10) (1024, 0) IN (1, 0)        1.09      5.5±0.02µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(10) (1024, 0) IN (10, 0)       1.09      5.5±0.01µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(10) (1024, 0) IN (100, 0)      1.05      5.5±0.05µs        ? ?/sec    1.00      5.3±0.03µs        ? ?/sec
in_list_utf8(10) (1024, 0) IN (3, 0)        1.13      5.7±0.06µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(10) (1024, 0.2) IN (1, 0)      1.00      7.6±0.04µs        ? ?/sec    1.05      8.0±0.07µs        ? ?/sec
in_list_utf8(10) (1024, 0.2) IN (10, 0)     1.00      7.8±0.04µs        ? ?/sec    1.01      7.8±0.03µs        ? ?/sec
in_list_utf8(10) (1024, 0.2) IN (100, 0)    1.03      8.0±0.09µs        ? ?/sec    1.00      7.8±0.07µs        ? ?/sec
in_list_utf8(10) (1024, 0.2) IN (3, 0)      1.00      7.7±0.04µs        ? ?/sec    1.06      8.1±0.06µs        ? ?/sec
in_list_utf8(20) (1024, 0) IN (1, 0)        1.10      5.6±0.19µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(20) (1024, 0) IN (10, 0)       1.09      5.5±0.01µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(20) (1024, 0) IN (100, 0)      1.09      5.5±0.01µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(20) (1024, 0) IN (3, 0)        1.04      5.5±0.01µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
in_list_utf8(20) (1024, 0.2) IN (1, 0)      1.00      7.9±0.03µs        ? ?/sec    1.01      8.0±0.03µs        ? ?/sec
in_list_utf8(20) (1024, 0.2) IN (10, 0)     1.00      7.9±0.10µs        ? ?/sec    1.00      7.9±0.06µs        ? ?/sec
in_list_utf8(20) (1024, 0.2) IN (100, 0)    1.00      7.8±0.03µs        ? ?/sec    1.00      7.8±0.05µs        ? ?/sec
in_list_utf8(20) (1024, 0.2) IN (3, 0)      1.02      8.2±0.08µs        ? ?/sec    1.00      8.0±0.05µs        ? ?/sec
in_list_utf8(5) (1024, 0) IN (1, 0)         1.12      5.7±0.05µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(5) (1024, 0) IN (10, 0)        1.09      5.5±0.01µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(5) (1024, 0) IN (100, 0)       1.09      5.5±0.01µs        ? ?/sec    1.00      5.0±0.02µs        ? ?/sec
in_list_utf8(5) (1024, 0) IN (3, 0)         1.09      5.5±0.02µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list_utf8(5) (1024, 0.2) IN (1, 0)       1.00      7.6±0.02µs        ? ?/sec    1.03      7.8±0.08µs        ? ?/sec
in_list_utf8(5) (1024, 0.2) IN (10, 0)      1.00      7.8±0.05µs        ? ?/sec    1.00      7.8±0.03µs        ? ?/sec
in_list_utf8(5) (1024, 0.2) IN (100, 0)     1.00      7.7±0.03µs        ? ?/sec    1.00      7.7±0.06µs        ? ?/sec
in_list_utf8(5) (1024, 0.2) IN (3, 0)       1.04      8.0±0.04µs        ? ?/sec    1.00      7.7±0.05µs        ? ?/sec

@adriangb
Copy link
Contributor Author

adriangb commented Nov 9, 2025

It looks like there are indeed some regressions. I propose we do two things:

  1. Add a create_hashes_unbuffered(…) -> &[u64] that uses a thread local to re-use the buffer. I think this will be helpful in other contexts as well.
  2. Create a make_typed_comparator that returns an enum that is typed for non-recursive types and delegates to a fallback dynamically typed variant for recursive types. I’ll implement it here for now but make a note that it would be good to upstream into arrow. When it is up streamed into arrow we can re-implement the current version in terms of their new version and deprecate the current function.

I think that will get us the broader type support and code re-use while avoiding any slowdown. Once we do the upstreaming into arrow it won’t even be any more code than it is now (a bit more code in arrow but not even that much). And we should be able to do it all in one PR here

@github-actions github-actions bot added the common Related to common crate label Nov 10, 2025
alamb and others added 2 commits November 18, 2025 17:03
* Consolidate StaticFilter and ArrayHashSet

* Fix docs
@adriangb
Copy link
Contributor Author

I'm surprised that doing dynamic dispatch once per batch we evaluate as opposed to twice per batch we evaluate makes that much of a difference. What would make sense that makes a difference to me is doing it once per element vs. once per batch. But I guess that's what benchmarks say!

That does leave me with a question... could we squeeze out even more performance if we specialize for ~ all scalar types? It wouldn't be that hard to write a macro and have AI do the copy pasta of implementing it for all of the types... I'll open a follow up ticket.

@adriangb
Copy link
Contributor Author

Also thank you for your help getting this across the line @alamb! I'm excited to continue the work.

@adriangb adriangb added this pull request to the merge queue Nov 18, 2025
Merged via the queue into apache:main with commit 486c5d8 Nov 18, 2025
32 checks passed
@adriangb adriangb deleted the refactor-in-list branch November 18, 2025 23:37
@alamb
Copy link
Contributor

alamb commented Nov 19, 2025

That does leave me with a question... could we squeeze out even more performance if we specialize for ~ all scalar types? It wouldn't be that hard to write a macro and have AI do the copy pasta of implementing it for all of the types... I'll open a follow up ticket.

Yes this is what I think we should do

}

ArrayHashSet { state, map }
struct Int32StaticFilter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, we should totally do the same thing here for the other types. I'll file a ticket to track that

@alamb
Copy link
Contributor

alamb commented Nov 19, 2025

github-merge-queue bot pushed a commit that referenced this pull request Nov 20, 2025
…for more precise filters (#18451)

## Background

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
#17171.

A "target state" is tracked in
#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- #18448
- #18449 (depends on
#18448)
- (This PR): #18451

## Changes in this PR

This PR refactors state management in HashJoinExec to make filter
pushdown more efficient and prepare for pushing down membership tests.

- Refactor internal data structures to clean up state management and
make usage more idiomatic (use `Option` instead of comparing integers,
etc.)
- Uses CASE expressions to evaluate pushed-down filters selectively by
partition Example: `CASE hash_repartition % N WHEN partition_id THEN
condition ELSE false END`

---------

Co-authored-by: Lía Adriana <lia.castaneda@datadoghq.com>
logan-keede pushed a commit to logan-keede/datafusion that referenced this pull request Nov 23, 2025
…nfrastructure (apache#18449)

## Background

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- (This PR): apache#18449 (depends on
apache#18448)
- apache#18451

## Changes in this PR

- Enhance InListExpr to efficiently store homogeneous lists as arrays
and avoid a conversion to Vec<PhysicalExpr>
  by adding an internal InListStorage enum with Array and Exprs variants
- Re-use existing hashing and comparison utilities to support Struct
arrays and other complex types
- Add public function `in_list_from_array(expr, list_array, negated)`
for creating InList from arrays

Although the diff looks large most of it is actually tests and docs. I
think the actual code change is a negative LOC change, or at least
negative complexity (eliminates a trait, a macro, matching on data
types).

---------

Co-authored-by: David Hewitt <mail@davidhewitt.dev>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
logan-keede pushed a commit to logan-keede/datafusion that referenced this pull request Nov 23, 2025
…for more precise filters (apache#18451)

## Background

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- (This PR): apache#18451

## Changes in this PR

This PR refactors state management in HashJoinExec to make filter
pushdown more efficient and prepare for pushing down membership tests.

- Refactor internal data structures to clean up state management and
make usage more idiomatic (use `Option` instead of comparing integers,
etc.)
- Uses CASE expressions to evaluate pushed-down filters selectively by
partition Example: `CASE hash_repartition % N WHEN partition_id THEN
condition ELSE false END`

---------

Co-authored-by: Lía Adriana <lia.castaneda@datadoghq.com>
github-merge-queue bot pushed a commit that referenced this pull request Dec 9, 2025
… on the size of the build side (#18393)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
#17171.

A "target state" is tracked in
#18393 (*this PR*).
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- #18448
- #18449 (depends on
#18448)
- #18451

As those are merged I will rebase this PR to keep track of the
"remaining work", and we can use this PR to explore big picture ideas or
benchmarks of the final state.
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- (This PR): apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

Change create_hashes and related functions to work with &dyn Array
references instead of requiring ArrayRef (Arc-wrapped arrays). This
avoids unnecessary Arc::clone() calls and enables calls that only have
an &dyn Array to use the hashing utilities.

- Add create_hashes_from_arrays(&[&dyn Array]) function
- Refactor hash_dictionary, hash_list_array, hash_fixed_list_array to
use references instead of cloning
- Extract hash_single_array() helper for common logic

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit a899ca0)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
…nfrastructure (apache#18449)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- (This PR): apache#18449 (depends on
apache#18448)
- apache#18451

- Enhance InListExpr to efficiently store homogeneous lists as arrays
and avoid a conversion to Vec<PhysicalExpr>
  by adding an internal InListStorage enum with Array and Exprs variants
- Re-use existing hashing and comparison utilities to support Struct
arrays and other complex types
- Add public function `in_list_from_array(expr, list_array, negated)`
for creating InList from arrays

Although the diff looks large most of it is actually tests and docs. I
think the actual code change is a negative LOC change, or at least
negative complexity (eliminates a trait, a macro, matching on data
types).

---------

Co-authored-by: David Hewitt <mail@davidhewitt.dev>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit 486c5d8)
LiaCastaneda added a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
…for more precise filters (apache#18451)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- (This PR): apache#18451

This PR refactors state management in HashJoinExec to make filter
pushdown more efficient and prepare for pushing down membership tests.

- Refactor internal data structures to clean up state management and
make usage more idiomatic (use `Option` instead of comparing integers,
etc.)
- Uses CASE expressions to evaluate pushed-down filters selectively by
partition Example: `CASE hash_repartition % N WHEN partition_id THEN
condition ELSE false END`

---------

Co-authored-by: Lía Adriana <lia.castaneda@datadoghq.com>
(cherry picked from commit 5b0aa37)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
… on the size of the build side (apache#18393)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393 (*this PR*).
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

As those are merged I will rebase this PR to keep track of the
"remaining work", and we can use this PR to explore big picture ideas or
benchmarks of the final state.

(cherry picked from commit c0e8bb5)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- (This PR): apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

Change create_hashes and related functions to work with &dyn Array
references instead of requiring ArrayRef (Arc-wrapped arrays). This
avoids unnecessary Arc::clone() calls and enables calls that only have
an &dyn Array to use the hashing utilities.

- Add create_hashes_from_arrays(&[&dyn Array]) function
- Refactor hash_dictionary, hash_list_array, hash_fixed_list_array to
use references instead of cloning
- Extract hash_single_array() helper for common logic

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit a899ca0)
(cherry picked from commit e53debb)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
…nfrastructure (apache#18449)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- (This PR): apache#18449 (depends on
apache#18448)
- apache#18451

- Enhance InListExpr to efficiently store homogeneous lists as arrays
and avoid a conversion to Vec<PhysicalExpr>
  by adding an internal InListStorage enum with Array and Exprs variants
- Re-use existing hashing and comparison utilities to support Struct
arrays and other complex types
- Add public function `in_list_from_array(expr, list_array, negated)`
for creating InList from arrays

Although the diff looks large most of it is actually tests and docs. I
think the actual code change is a negative LOC change, or at least
negative complexity (eliminates a trait, a macro, matching on data
types).

---------

Co-authored-by: David Hewitt <mail@davidhewitt.dev>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit 486c5d8)
(cherry picked from commit 181e058)
LiaCastaneda added a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
…for more precise filters (apache#18451)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- (This PR): apache#18451

This PR refactors state management in HashJoinExec to make filter
pushdown more efficient and prepare for pushing down membership tests.

- Refactor internal data structures to clean up state management and
make usage more idiomatic (use `Option` instead of comparing integers,
etc.)
- Uses CASE expressions to evaluate pushed-down filters selectively by
partition Example: `CASE hash_repartition % N WHEN partition_id THEN
condition ELSE false END`

---------

Co-authored-by: Lía Adriana <lia.castaneda@datadoghq.com>
(cherry picked from commit 5b0aa37)
(cherry picked from commit e9d1985)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 17, 2025
… on the size of the build side (apache#18393)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393 (*this PR*).
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

As those are merged I will rebase this PR to keep track of the
"remaining work", and we can use this PR to explore big picture ideas or
benchmarks of the final state.

(cherry picked from commit c0e8bb5)
(cherry picked from commit 115313c)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 22, 2025
…nfrastructure (apache#18449)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- (This PR): apache#18449 (depends on
apache#18448)
- apache#18451

- Enhance InListExpr to efficiently store homogeneous lists as arrays
and avoid a conversion to Vec<PhysicalExpr>
  by adding an internal InListStorage enum with Array and Exprs variants
- Re-use existing hashing and comparison utilities to support Struct
arrays and other complex types
- Add public function `in_list_from_array(expr, list_array, negated)`
for creating InList from arrays

Although the diff looks large most of it is actually tests and docs. I
think the actual code change is a negative LOC change, or at least
negative complexity (eliminates a trait, a macro, matching on data
types).

---------

Co-authored-by: David Hewitt <mail@davidhewitt.dev>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit 486c5d8)
LiaCastaneda added a commit to DataDog/datafusion that referenced this pull request Dec 22, 2025
…for more precise filters (apache#18451)

## Background

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- (This PR): apache#18451

## Changes in this PR

This PR refactors state management in HashJoinExec to make filter
pushdown more efficient and prepare for pushing down membership tests.

- Refactor internal data structures to clean up state management and
make usage more idiomatic (use `Option` instead of comparing integers,
etc.)
- Uses CASE expressions to evaluate pushed-down filters selectively by
partition Example: `CASE hash_repartition % N WHEN partition_id THEN
condition ELSE false END`

---------

Co-authored-by: Lía Adriana <lia.castaneda@datadoghq.com>
(cherry picked from commit 5b0aa37)
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Dec 22, 2025
… on the size of the build side (apache#18393)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393 (*this PR*).
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

As those are merged I will rebase this PR to keep track of the
"remaining work", and we can use this PR to explore big picture ideas or
benchmarks of the final state.

(cherry picked from commit c0e8bb5)
LiaCastaneda added a commit to DataDog/datafusion that referenced this pull request Jan 15, 2026
* Refactor InListExpr to support structs by re-using existing hashing infrastructure  (apache#18449)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- (This PR): apache#18449 (depends on
apache#18448)
- apache#18451

- Enhance InListExpr to efficiently store homogeneous lists as arrays
and avoid a conversion to Vec<PhysicalExpr>
  by adding an internal InListStorage enum with Array and Exprs variants
- Re-use existing hashing and comparison utilities to support Struct
arrays and other complex types
- Add public function `in_list_from_array(expr, list_array, negated)`
for creating InList from arrays

Although the diff looks large most of it is actually tests and docs. I
think the actual code change is a negative LOC change, or at least
negative complexity (eliminates a trait, a macro, matching on data
types).

---------

Co-authored-by: David Hewitt <mail@davidhewitt.dev>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit 486c5d8)

* feat: Add evaluate_to_arrays function (apache#18446)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes apache#123` indicates that this PR will close issue apache#123.
-->

- Closes apache#18330 .

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Reduce code duplication.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

A util function replacing many calls which are using the same code.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

No logic should change whatsoever, so each area which now uses this code
should have it's own tests and benchmarks unmodified.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Yes, there is now a new pub function.
No other changes to API.

---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
(cherry picked from commit 76b4156)

* Refactor state management in `HashJoinExec` and use CASE expressions for more precise filters (apache#18451)

## Background

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393.
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- (This PR): apache#18451

## Changes in this PR

This PR refactors state management in HashJoinExec to make filter
pushdown more efficient and prepare for pushing down membership tests.

- Refactor internal data structures to clean up state management and
make usage more idiomatic (use `Option` instead of comparing integers,
etc.)
- Uses CASE expressions to evaluate pushed-down filters selectively by
partition Example: `CASE hash_repartition % N WHEN partition_id THEN
condition ELSE false END`

---------

Co-authored-by: Lía Adriana <lia.castaneda@datadoghq.com>
(cherry picked from commit 5b0aa37)

* Push down InList or hash table references from HashJoinExec depending on the size of the build side (apache#18393)

This PR is part of an EPIC to push down hash table references from
HashJoinExec into scans. The EPIC is tracked in
apache#17171.

A "target state" is tracked in
apache#18393 (*this PR*).
There is a series of PRs to get us to this target state in smaller more
reviewable changes that are still valuable on their own:
- apache#18448
- apache#18449 (depends on
apache#18448)
- apache#18451

As those are merged I will rebase this PR to keep track of the
"remaining work", and we can use this PR to explore big picture ideas or
benchmarks of the final state.

(cherry picked from commit c0e8bb5)

* fmt

* replace HashTableLookupExpr with lit(true) in proto serialization (apache#19300)
*errors* when serializing now, and would break any users using joins +
protobuf.

---------

Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
Co-authored-by: David Hewitt <mail@davidhewitt.dev>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Emily Matheys <55631053+EmilyMatt@users.noreply.github.com>
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants