Conversation

@blaginin
Collaborator

sync main

emilk and others added 30 commits September 17, 2025 21:52
* Use Display formatting for DataTypes where I could find them

* fix

* More places

* Less Debug

* Cargo fmt

* More cleanup

* Plural types as Display

* Fixes

* Update some more tests and error messages

* Update test snapshot

* last (?) fixes

* update another slt

* Update instructions on how to run the tests

* Ignore pending snapshot files in .gitignore

* Running all the tests is so slow

* just a trailing space

* Update another test

* Fix markdown formatting

* Improve Display for NativeType

* Update code related to error reporting of NativeType

* Revert some formatting

* fixelyfix

* Another snapshot update
* Move GSOC content to its own section

* Update to 2025
* feat: Add `OR REPLACE` to creating external tables

* regen

* fmt

* make more explicit + add tests

* clippy fix

---------

Co-authored-by: Dmitrii Blaginin <dmitrii@blaginin.me>
* chore: mv `DistinctSumAccumulator` to common

* feat: add avg distinct support for float64 type

* chore: fmt

* refactor: update import for DataType in Float64DistinctAvgAccumulator and remove unused sum_distinct module

* feat: add avg distinct support for float64 type

* feat: add avg distinct support for decimal

* feat: more test for avg distinct in rust api

* Remove DataFrame API tests for avg(distinct)

* Remove proto test

* Fix merge errors

* Refactoring

* Minor cleanup

* Decimal slt tests for avg(distinct)

* Fix state_fields for decimal distinct avg

---------

Co-authored-by: YuNing Chen <admin@ynchen.me>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Dmitrii Blaginin <dmitrii@blaginin.me>
Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.61.8 to 2.61.9.
- [Release notes](https://github.com/taiki-e/install-action/releases)
- [Changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md)
- [Commits](taiki-e/install-action@2fdc5fd...8ea3248)

---
updated-dependencies:
- dependency-name: taiki-e/install-action
  dependency-version: 2.61.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.8.0 to 2.8.1.
- [Release notes](https://github.com/swatinem/rust-cache/releases)
- [Changelog](https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md)
- [Commits](Swatinem/rust-cache@98c8021...f13886b)

---
updated-dependencies:
- dependency-name: Swatinem/rust-cache
  dependency-version: 2.8.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#17029)

* use GreedyMemoryPool for sanity check

* validate whether batch read from spill exceeds max_record_batch_mem

* replace err with warn log
* fix(SubqueryAlias): use maybe_project_redundant_column

Fixes #17405

* chore: format

* ci: retry

* chore(SubqueryAlias): restructure duplicate detection and add tests

* docs: add examples and context to the reproducer
* optimizer: Convert to Hash Join for join predicates like 'a IS NOT DISTINCT FROM b'

* drop tables in slt

* fix rust doc

* Update datafusion/optimizer/src/extract_equijoin_predicate.rs

Co-authored-by: Jonathan Chen <chenleejonathan@gmail.com>

* Update datafusion/optimizer/src/extract_equijoin_predicate.rs

* Update datafusion/sqllogictest/test_files/join_is_not_distinct_from.slt

Co-authored-by: Nga Tran <nga-tran@live.com>

* review: more tests and better error message

* review: improve doc

---------

Co-authored-by: Jonathan Chen <chenleejonathan@gmail.com>
Co-authored-by: Nga Tran <nga-tran@live.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
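
For context on the `IS NOT DISTINCT FROM` commit above: the predicate is a null-safe equality, so two NULL keys compare as equal, which is what lets it act as a hash-join key. The sketch below is only an illustration of that idea using the Rust standard library; the function name `null_safe_hash_join` and the use of `Option<i32>` keys are assumptions made for this example, not the code in `extract_equijoin_predicate.rs`.

```rust
use std::collections::HashMap;

/// Hypothetical illustration: join two key columns with null-safe equality.
/// `None` stands in for SQL NULL. Because `Option<i32>` hashes and compares
/// `None == None` as true, a plain hash map gives `IS NOT DISTINCT FROM`
/// semantics for the join key.
fn null_safe_hash_join(left: &[Option<i32>], right: &[Option<i32>]) -> Vec<(usize, usize)> {
    // Build side: map each key (including NULL) to the left row indices holding it.
    let mut build: HashMap<Option<i32>, Vec<usize>> = HashMap::new();
    for (i, key) in left.iter().enumerate() {
        build.entry(*key).or_default().push(i);
    }

    // Probe side: look up each right key and emit (left_row, right_row) pairs.
    let mut matches = Vec::new();
    for (j, key) in right.iter().enumerate() {
        if let Some(rows) = build.get(key) {
            for &i in rows {
                matches.push((i, j));
            }
        }
    }
    matches
}
```

With a regular `=` predicate the NULL rows would never match; here a NULL on the left pairs with a NULL on the right, which is the behavioural difference the optimizer change has to preserve when it picks a hash join.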
* Update to arrow/parquet 56.1.0

* Adjust for new parquet sizes, update for deprecated API

* Thread through max_predicate_cache_size, add test
…xpression (#17525)

* [ISSUE 17425] Initial attempt to fix this problem

* Add tests for the fix

* Require that the metadata of values in VALUES clause must be identical

* fix merge error

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.223 to 1.0.225.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.223...v1.0.225)

---
updated-dependencies:
- dependency-name: serde
  dependency-version: 1.0.225
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dmitrii Blaginin <dmitrii@blaginin.me>
* chore: update dynamic filter formatting to indicate expr is placeholder

* update tests

* update tests
Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.61.9 to 2.61.10.
- [Release notes](https://github.com/taiki-e/install-action/releases)
- [Changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md)
- [Commits](taiki-e/install-action@8ea3248...0aa4f22)

---
updated-dependencies:
- dependency-name: taiki-e/install-action
  dependency-version: 2.61.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…fusion dev dependency (#17656)

* minor: Ensure `proto` crate has datetime & unicode expr flags in datafusion dev dependency

* toml formatting
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.11.3 to 2.11.4.
- [Changelog](https://github.com/indexmap-rs/indexmap/blob/main/RELEASES.md)
- [Commits](indexmap-rs/indexmap@2.11.3...2.11.4)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-version: 2.11.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… sized `FixedSizeBinary` arguments (#17531)

* Introduce wildcard const for FixedSizeBinary type signature

* Add Binary to TypeSignatureClass

* Remove FIXED_SIZE_BINARY_WILDCARD
* docs: deduplicate links in `introduction.md`

* Further simplifications

* Fix
* Add committers explicitly to governance page, with script

* add license header

* Update Wes McKinney's affiliation in governance.md

* Update adriangb's affiliation

* Update affiliation

* Andy Grove Affiliation

* Update Qi Zhu affiliation

* Update linwei's info

* Update docs/source/contributor-guide/governance.md

* Update docs/source/contributor-guide/governance.md

* Apply suggestions from code review

Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

* Apply suggestions from code review

Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
Co-authored-by: Yang Jiang <jiangyang381@163.com>
Co-authored-by: Yongting You <2010youy01@gmail.com>

* Apply suggestions from code review

Co-authored-by: Yijie Shen <henry.yijieshen@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Brent Gardner <bgardner@squarelabs.net>
Co-authored-by: Dmitrii Blaginin <github@blaginin.me>
Co-authored-by: Jax Liu <liugs963@gmail.com>
Co-authored-by: Ifeanyi Ubah <ify1992@yahoo.com>

* Apply suggestions from code review

Co-authored-by: Will Jones <willjones127@gmail.com>

* Clarify what is updated in the script

* Apply suggestions from code review

Co-authored-by: Paddy Horan <5733408+paddyhoran@users.noreply.github.com>
Co-authored-by: Dan Harris <1327726+thinkharderdev@users.noreply.github.com>

* Update docs/source/contributor-guide/governance.md

* Update docs/source/contributor-guide/governance.md

Co-authored-by: Parth Chandra <parthc@apache.org>

* Update docs/source/contributor-guide/governance.md

* prettier

---------

Co-authored-by: Wes McKinney <wesm@apache.org>
Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
Co-authored-by: Mustafa Akur <akurmustafa@gmail.com>
Co-authored-by: Qi Zhu <821684824@qq.com>
Co-authored-by: 张林伟 <lewiszlw520@gmail.com>
Co-authored-by: xudong.w <wxd963996380@gmail.com>
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
Co-authored-by: Yang Jiang <jiangyang381@163.com>
Co-authored-by: Yongting You <2010youy01@gmail.com>
Co-authored-by: Yijie Shen <henry.yijieshen@gmail.com>
Co-authored-by: Brent Gardner <bgardner@squarelabs.net>
Co-authored-by: Dmitrii Blaginin <github@blaginin.me>
Co-authored-by: Jax Liu <liugs963@gmail.com>
Co-authored-by: Ifeanyi Ubah <ify1992@yahoo.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
Co-authored-by: Paddy Horan <5733408+paddyhoran@users.noreply.github.com>
Co-authored-by: Dan Harris <1327726+thinkharderdev@users.noreply.github.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Parth Chandra <parthc@apache.org>
* Support Decimal32/64 types

* Fix bugs, tests, handle more aggregate functions and schema

* Fill out more parts in expr,common and expr-common

* Some stragglers and overlooked corners

* Actually commit the avg_distinct support

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…ut (#17664)

* Clarify null-equal explain expectations

* Format null equality display strings

* fix test

* review: more concise message

* review: more concise message
Adds support for `ScalarValue::Time64Microsecond` and `ScalarValue::Time64Nanosecond` to be converted to and from Substrait literals. This includes the `PrecisionTime` literal type and specific `TIME_64_TYPE_VARIATION_REF` for 6-digit (microseconds) and 9-digit (nanoseconds) precision.

Co-authored-by: Bruno Volpato <bruno.volpato@datadoghq.com>
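
The precision mapping described in the Substrait commit above (6 digits for microseconds, 9 digits for nanoseconds) can be sketched as a small lookup. This is a standalone illustration: the `TimeUnit` enum and both helper functions are hypothetical names invented here, and the sketch does not use the DataFusion or Substrait crates.

```rust
/// Unit of a time-of-day value stored as a 64-bit count since midnight.
/// (Illustrative type, not the Arrow or DataFusion definition.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeUnit {
    Microsecond,
    Nanosecond,
}

/// Hypothetical helper: map a fractional-second precision (number of digits)
/// to a 64-bit time unit. Only the two precisions named in the commit
/// message are handled; anything else is reported as unsupported.
fn unit_for_precision(precision: u32) -> Option<TimeUnit> {
    match precision {
        6 => Some(TimeUnit::Microsecond),
        9 => Some(TimeUnit::Nanosecond),
        _ => None,
    }
}

/// Inverse mapping, as used when producing a literal from a scalar value.
fn precision_for_unit(unit: TimeUnit) -> u32 {
    match unit {
        TimeUnit::Microsecond => 6,
        TimeUnit::Nanosecond => 9,
    }
}
```

Returning `Option` for the digits-to-unit direction keeps an unsupported precision explicit instead of silently choosing a unit.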
* feat(spark): implement Spark `map` function `map_from_entries`

* fix: map_from_entries with null entries in lists, chore: refactor initial offsets, add tests
* feat: Add Hash Join benchmarks

* fmt

* Update benchmarks/README.md

Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>

* add benchmarks

* update selectivities

* fix the error introduced when merging main

---------

Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
Co-authored-by: Yongting You <2010youy01@gmail.com>
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 2.0.16 to 2.0.17.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](dtolnay/thiserror@2.0.16...2.0.17)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-version: 2.0.17
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [quote](https://github.com/dtolnay/quote) from 1.0.40 to 1.0.41.
- [Release notes](https://github.com/dtolnay/quote/releases)
- [Commits](dtolnay/quote@1.0.40...1.0.41)

---
updated-dependencies:
- dependency-name: quote
  dependency-version: 1.0.41
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.9 to 2.62.12.
- [Release notes](https://github.com/taiki-e/install-action/releases)
- [Changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md)
- [Commits](taiki-e/install-action@71d339e...5ab3094)

---
updated-dependencies:
- dependency-name: taiki-e/install-action
  dependency-version: 2.62.12
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.227 to 1.0.228.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.227...v1.0.228)

---
updated-dependencies:
- dependency-name: serde
  dependency-version: 1.0.228
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…836)

Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.12 to 2.62.13.
- [Release notes](https://github.com/taiki-e/install-action/releases)
- [Changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md)
- [Commits](taiki-e/install-action@5ab3094...d0f4f69)

---
updated-dependencies:
- dependency-name: taiki-e/install-action
  dependency-version: 2.62.13
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>