Vector Verification: Rework to run based on env variable DUCKDB_DEBUG_VERIFY_VECTOR and move to Main.yml#16957
Merged
Mytherin merged 8 commits intoduckdb:mainfrom Apr 2, 2025
Merged
Conversation
… it at compile time
… for enum in Value::Numeric
Contributor
|
+1 on moving to this to ENV variable |
krlmlr
added a commit
to duckdb/duckdb-r
that referenced
this pull request
May 15, 2025
Vector Verification: Rework to run based on env variable `DUCKDB_DEBUG_VERIFY_VECTOR` and move to `Main.yml` (duckdb/duckdb#16957)
krlmlr
added a commit
to duckdb/duckdb-r
that referenced
this pull request
May 15, 2025
Vector Verification: Rework to run based on env variable `DUCKDB_DEBUG_VERIFY_VECTOR` and move to `Main.yml` (duckdb/duckdb#16957)
krlmlr
added a commit
to duckdb/duckdb-r
that referenced
this pull request
May 16, 2025
Vector Verification: Rework to run based on env variable `DUCKDB_DEBUG_VERIFY_VECTOR` and move to `Main.yml` (duckdb/duckdb#16957)
krlmlr
added a commit
to duckdb/duckdb-r
that referenced
this pull request
May 16, 2025
Vector Verification: Rework to run based on env variable `DUCKDB_DEBUG_VERIFY_VECTOR` and move to `Main.yml` (duckdb/duckdb#16957)
krlmlr
added a commit
to duckdb/duckdb-r
that referenced
this pull request
May 17, 2025
Vector Verification: Rework to run based on env variable `DUCKDB_DEBUG_VERIFY_VECTOR` and move to `Main.yml` (duckdb/duckdb#16957)
Mytherin
added a commit
that referenced
this pull request
Jul 16, 2025
#16957 fixed the bug in `Vector::Reinterpret` for dictionary vectors by creating a new child buffer every time. However, creating a new child buffer is much more expensive than copying a shared_ptr, which results in about 3% performance regression in TPC-H benchmarks. The main issue is that a dictionary vector cannot directly copy the child buffer if the types don't match, as this would make the child vector's type inconsistent. So we can check whether the types are different and only create a new child buffer when the types are not equal - which isn't the case in most scenarios. benchmark result: ``` python scripts/regression/test_runner.py --old build/release-old/benchmark/benchmark_runner --new build/release/benchmark/benchmark_runner --benchmarks .github/regression/tpch.csv --threads 2 Old timing geometric mean: 0.02501917589777838 New timing geometric mean: 0.024095980407583723, roughly 3% faster python scripts/regression/test_runner.py --old build/release-old/benchmark/benchmark_runner --new build/release/benchmark/benchmark_runner --benchmarks .github/regression/tpcds.csv --threads 2 Old timing geometric mean: 0.021899101352704627 New timing geometric mean: 0.021254155592682008, roughly 2% faster ```
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR modifies the vector verification tests so they run based on an environment variable, e.g.:
This then allows all of the vector verification tests to be run in a single workflow instead of requiring multiple re-compilations.
We move this workflow from nightly to
Main.ymlas well.In addition, we fix a minor bug with
Reinterpreton dictionary vectors which fixes an issue in the sequence operator verification run.