Fix DISTINCT query with JOIN on multiple segmentby columns #5680

Merged
konskov merged 1 commit into timescale:main from konskov:distinct_compress_fix_5585 on May 17, 2023

konskov (Contributor) commented May 11, 2023

Previously, when adding equivalence class members for the compressed
chunk's variables, we only considered plain Vars. Cases where the Var
was wrapped in a RelabelType were ignored, which produced incorrect
query results.

The fix accepts Vars wrapped in a RelabelType when building the
segmentby equivalence class.

Fixes #5585
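
As a rough illustration of the idea described above, the sketch below looks through a single RelabelType wrapper and checks the node tag before casting. It is not the code from this PR: the node structs are simplified stand-ins for PostgreSQL's definitions, and the helper name `segmentby_var_from_expr` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins for PostgreSQL planner node types. These are
 * illustrative assumptions, not the real definitions from the server headers.
 */
typedef enum NodeTag
{
	T_Var,
	T_RelabelType,
	T_Const
} NodeTag;

typedef struct Node
{
	NodeTag type;
} Node;

typedef struct Var
{
	Node node;    /* must be first so a Var can be viewed as a Node */
	int varno;    /* range table index of the relation */
	int varattno; /* attribute number within the relation */
} Var;

typedef struct RelabelType
{
	Node node; /* must be first */
	Node *arg; /* the wrapped expression */
} RelabelType;

/* Hypothetical helper: returns the Var behind expr, or NULL if there is none. */
Var *
segmentby_var_from_expr(Node *expr)
{
	assert(expr != NULL);

	/* Look through one binary-compatible relabel, e.g. varchar to text. */
	if (expr->type == T_RelabelType)
		expr = ((RelabelType *) expr)->arg;

	/* Check the node tag before casting instead of using a raw cast. */
	if (expr != NULL && expr->type == T_Var)
		return (Var *) expr;

	return NULL;
}
```

Under these assumptions, passing a RelabelType whose `arg` is a Var returns that Var, while any other expression (a Const, for instance) returns NULL so the caller can skip it.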

codecov bot commented May 11, 2023

Codecov Report

Merging #5680 (547c191) into main (fb65086) will increase coverage by 3.26%.
The diff coverage is 100.00%.

❗ Current head 547c191 differs from pull request most recent head 07433cc. Consider uploading reports for the commit 07433cc to get more accurate results

@@            Coverage Diff             @@
##             main    #5680      +/-   ##
==========================================
+ Coverage   87.71%   90.98%   +3.26%     
==========================================
  Files         231      230       -1     
  Lines       54803    54624     -179     
  Branches    12055        0   -12055     
==========================================
+ Hits        48073    49697    +1624     
- Misses       4902     4927      +25     
+ Partials     1828        0    -1828     
| Impacted Files | Coverage Δ |
|---|---|
| tsl/src/nodes/decompress_chunk/decompress_chunk.c | 94.52% <100.00%> (+4.00%) ⬆️ |

... and 190 files with indirect coverage changes

@konskov konskov force-pushed the distinct_compress_fix_5585 branch 3 times, most recently from c902339 to df3e4cd Compare May 11, 2023 12:24
Comment on lines +8595 to +8603
-- github issue 5585
create table kon (
time timestamptz not null,
a varchar(255) not null,
b int,
c int
);
SELECT create_hypertable('kon', 'time');
WARNING: column type "character varying" used for "a" does not follow best practices
Member:

What do you think about putting this into a separate file? This one's output is already 9k lines, and it's versioned, so it's becoming unwieldy.

konskov (Contributor, Author) commented May 12, 2023

Sure, I will create a new test file for this. Is transparent_decompression_idx_join a good name? And should this test be versioned as well? I think it would be good to make it a versioned test, since transparent_decompression is versioned too.
Edit: I see we already have transparent_decompression_ordered_index, which looks like a better place to add this test.

Member:

Yep, that one is smaller. About versions, I try to avoid them when possible, because they are inconvenient to work with.

akuzm (Member) left a comment

This doesn't look like it would do any harm. Just don't forget to add the proper node type checks; right now you have a raw cast.

@konskov konskov force-pushed the distinct_compress_fix_5585 branch 4 times, most recently from dee2300 to 832ebea Compare May 12, 2023 07:26
@konskov konskov changed the title Accept RelabelType for segby equivclass Fix DISTINCT query with JOIN on multiple segmentby columns May 12, 2023
@konskov konskov marked this pull request as ready for review May 12, 2023 08:32
@github-actions github-actions bot requested review from gayyappan and nikkhils May 12, 2023 08:32
@github-actions

@gayyappan, @nikkhils: please review this pull request.

Powered by pull-review

@konskov konskov force-pushed the distinct_compress_fix_5585 branch 3 times, most recently from 547c191 to 608a5d3 Compare May 15, 2023 06:14
konskov (Contributor, Author) commented May 15, 2023

Thank you for reviewing @akuzm! Actually I’m not sure what node type checks to add and where, could you point them out if it’s not too much trouble? Thank you so much!

akuzm (Member) commented May 16, 2023

> Thank you for reviewing @akuzm! Actually I’m not sure what node type checks to add and where, could you point them out if it’s not too much trouble? Thank you so much!

I just meant the IsA check; I think you already added it.

@konskov konskov force-pushed the distinct_compress_fix_5585 branch 2 times, most recently from 4348919 to 4a31991 Compare May 16, 2023 11:40
@konskov konskov enabled auto-merge (rebase) May 16, 2023 12:42
@konskov konskov disabled auto-merge May 16, 2023 12:50
@konskov konskov enabled auto-merge (rebase) May 16, 2023 12:50
-- force an index scan
set enable_seqscan = 'off';
-- disable jit to avoid test flakiness
set jit_above_cost = -1;
Contributor:

Suggested change
- set jit_above_cost = -1;
+ set jit = off;

This is the correct way to completely turn off JIT.

Contributor, Author:

Fixed, thank you.

@konskov konskov disabled auto-merge May 16, 2023 18:22
@konskov konskov force-pushed the distinct_compress_fix_5585 branch from 4a31991 to 07433cc Compare May 17, 2023 05:16
@konskov konskov merged commit 19dd7bb into timescale:main May 17, 2023
timescale-automation (Member) commented:

Automated backport to 2.10.x not done: cherry-pick failed.

Git status

HEAD detached at origin/2.10.x
You are currently cherry-picking commit 19dd7bbd.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   CHANGELOG.md
	modified:   tsl/src/nodes/decompress_chunk/decompress_chunk.c
	new file:   tsl/test/expected/transparent_decompression_join_index.out
	new file:   tsl/test/sql/transparent_decompression_join_index.sql

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   tsl/test/sql/CMakeLists.txt


@timescale-automation timescale-automation added the auto-backport-not-done Automated backport of this PR has failed non-retriably (e.g. conflicts) label May 17, 2023
@konskov konskov added this to the TimescaleDB 2.11.2 milestone Aug 3, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
This release contains bug fixes since the 2.11.0 release.
We recommend that you upgrade at the next available opportunity.

**Features**
* timescale#5909 CREATE INDEX ONLY ON hypertable creates index on chunks
* timescale#5923 Feature flags for TimescaleDB features
**Bugfixes**
* timescale#5680 Fix DISTINCT query with JOIN on multiple segmentby columns
* timescale#5774 Fixed two bugs in decompression sorted merge code
* timescale#5786 Ensure pg_config --cppflags are passed
* timescale#5906 Fix quoting owners in sql scripts.
* timescale#5912 Fix crash in 1-step integer policy creation
**Thanks**
* @mrksngl for submitting a PR to fix extension upgrade scripts
* @ericdevries for reporting an issue with DISTINCT queries using
segmentby columns of compressed hypertable
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
@konskov konskov mentioned this pull request Aug 9, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 9, 2023
konskov added a commit that referenced this pull request Aug 9, 2023
This release contains bug fixes since the 2.11.1 release.
We recommend that you upgrade at the next available opportunity.

**Features**
* #5923 Feature flags for TimescaleDB features

**Bugfixes**
* #5680 Fix DISTINCT query with JOIN on multiple segmentby columns
* #5774 Fixed two bugs in decompression sorted merge code
* #5786 Ensure pg_config --cppflags are passed
* #5906 Fix quoting owners in sql scripts.
* #5912 Fix crash in 1-step integer policy creation

**Thanks**
* @mrksngl for submitting a PR to fix extension upgrade scripts
* @ericdevries for reporting an issue with DISTINCT queries using
segmentby columns of compressed hypertable
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 10, 2023
konskov added a commit to konskov/timescaledb that referenced this pull request Aug 10, 2023
konskov added a commit that referenced this pull request Aug 10, 2023
svenklemm pushed a commit that referenced this pull request Aug 15, 2023
konskov added a commit that referenced this pull request Aug 16, 2023
Development

Successfully merging this pull request may close these issues.

[Bug]: Distinct query on multiple columns causes query to ignore join conditions