@airborne12
Member

@airborne12 airborne12 commented Oct 2, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #56139

Problem Summary:

This PR fixes a bug in NULL bitmap handling for MATCH OR queries that use the inverted index. The bug caused incorrect Boolean evaluation when an OR combined TRUE and NULL values.
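
For readers unfamiliar with the semantics involved, the following is a minimal sketch of SQL three-valued OR on single values. It assumes NULL is encoded as `std::nullopt`; the names `SqlBool`, `sql_or`, and `check_sql_or` are illustrative only and do not come from the Doris code base.

```cpp
#include <cassert>
#include <optional>

// Illustrative encoding: std::nullopt stands for SQL NULL.
using SqlBool = std::optional<bool>;

// Three-valued OR: TRUE dominates; otherwise NULL dominates FALSE.
SqlBool sql_or(SqlBool a, SqlBool b) {
    const bool a_true = a.has_value() && a.value();
    const bool b_true = b.has_value() && b.value();
    if (a_true || b_true) {
        return true;  // TRUE OR x = TRUE, even when x is NULL
    }
    if (!a.has_value() || !b.has_value()) {
        return std::nullopt;  // FALSE OR NULL = NULL, NULL OR NULL = NULL
    }
    return false;  // FALSE OR FALSE = FALSE
}

// Spot-checks of the truth table; call from main() or a test.
void check_sql_or() {
    assert(sql_or(true, std::nullopt) == SqlBool(true));
    assert(sql_or(std::nullopt, true) == SqlBool(true));
    assert(!sql_or(false, std::nullopt).has_value());
    assert(!sql_or(std::nullopt, std::nullopt).has_value());
    assert(sql_or(false, false) == SqlBool(false));
}
```

The case that matters for this PR is the first one: a row that is TRUE on one side of the OR must come out TRUE, not NULL.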

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 188685 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 54258e56d5e2eb2e24dbba2668cda86999e63465, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1058	436	405	405
query2	6548	1685	1714	1685
query3	6751	230	226	226
query4	27186	23376	23547	23376
query5	4973	651	500	500
query6	365	239	228	228
query7	4650	491	316	316
query8	299	261	245	245
query9	8704	2541	2573	2541
query10	529	338	279	279
query11	15664	15014	14897	14897
query12	178	117	115	115
query13	1677	543	435	435
query14	11134	9423	9411	9411
query15	214	193	176	176
query16	7722	641	519	519
query17	1173	765	620	620
query18	2091	447	371	371
query19	210	211	184	184
query20	146	135	126	126
query21	222	140	121	121
query22	4631	4890	4546	4546
query23	34682	34270	33715	33715
query24	9126	2586	2583	2583
query25	642	559	517	517
query26	1281	288	182	182
query27	2717	577	386	386
query28	4537	2242	2200	2200
query29	839	695	534	534
query30	315	246	210	210
query31	934	1103	787	787
query32	89	75	74	74
query33	598	377	327	327
query34	843	884	532	532
query35	859	909	810	810
query36	1070	1039	951	951
query37	117	113	97	97
query38	3500	3516	3466	3466
query39	1463	1439	1465	1439
query40	218	126	112	112
query41	68	70	60	60
query42	121	110	112	110
query43	503	504	465	465
query44	1336	843	826	826
query45	183	181	171	171
query46	844	996	653	653
query47	1771	1776	1739	1739
query48	395	424	317	317
query49	763	502	408	408
query50	651	710	407	407
query51	3916	3918	3967	3918
query52	109	111	104	104
query53	237	274	201	201
query54	594	585	532	532
query55	91	84	83	83
query56	331	334	337	334
query57	1183	1187	1123	1123
query58	282	284	270	270
query59	2518	2582	2677	2582
query60	349	338	324	324
query61	159	150	154	150
query62	803	716	653	653
query63	235	202	231	202
query64	4445	1134	824	824
query65	4042	3966	3947	3947
query66	1088	431	330	330
query67	15564	15274	15295	15274
query68	9315	945	596	596
query69	474	312	297	297
query70	1365	1365	1369	1365
query71	489	343	344	343
query72	5876	2586	5276	2586
query73	788	762	362	362
query74	8878	9260	8657	8657
query75	4480	3384	2916	2916
query76	4227	1181	777	777
query77	1033	419	350	350
query78	9708	9834	8934	8934
query79	1856	860	588	588
query80	752	550	497	497
query81	498	265	230	230
query82	428	161	129	129
query83	299	274	256	256
query84	303	118	93	93
query85	885	479	503	479
query86	336	316	305	305
query87	3799	3745	3773	3745
query88	3471	2239	2261	2239
query89	389	337	309	309
query90	2059	225	219	219
query91	171	168	133	133
query92	80	70	63	63
query93	1385	1009	656	656
query94	684	439	313	313
query95	400	323	318	318
query96	486	586	290	290
query97	2955	2970	2888	2888
query98	304	213	213	213
query99	1435	1419	1288	1288
Total cold run time: 282144 ms
Total hot run time: 188685 ms

@doris-robot

ClickBench: Total hot run time: 30.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 54258e56d5e2eb2e24dbba2668cda86999e63465, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.06	0.06
query3	0.25	0.08	0.08
query4	1.61	0.12	0.12
query5	0.26	0.27	0.25
query6	1.17	0.63	0.64
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.62	0.52	0.52
query10	0.58	0.58	0.56
query11	0.16	0.11	0.11
query12	0.18	0.12	0.12
query13	0.63	0.61	0.62
query14	1.05	1.05	1.02
query15	0.86	0.87	0.84
query16	0.42	0.41	0.40
query17	1.04	1.03	1.04
query18	0.22	0.20	0.19
query19	1.96	1.86	1.84
query20	0.02	0.01	0.01
query21	15.44	0.93	0.56
query22	0.76	1.32	0.68
query23	14.81	1.40	0.64
query24	7.04	1.28	0.56
query25	0.47	0.18	0.08
query26	0.64	0.17	0.13
query27	0.06	0.06	0.06
query28	9.60	1.37	0.91
query29	12.56	3.87	3.26
query30	0.28	0.13	0.12
query31	2.84	0.59	0.38
query32	3.25	0.57	0.47
query33	3.18	3.07	3.04
query34	16.11	5.49	4.92
query35	4.91	4.91	4.92
query36	0.71	0.52	0.50
query37	0.09	0.08	0.07
query38	0.06	0.05	0.05
query39	0.04	0.03	0.04
query40	0.18	0.15	0.15
query41	0.09	0.03	0.03
query42	0.04	0.03	0.04
query43	0.04	0.03	0.03
Total cold run time: 104.48 s
Total hot run time: 30.12 s

@airborne12
Member Author

run buildall

@airborne12 airborne12 requested a review from Copilot October 3, 2025 03:32

Copilot AI left a comment

Pull Request Overview

This PR fixes a critical bug in NULL bitmap handling for MATCH OR queries in the inverted index system. The bug was causing incorrect boolean logic evaluation when combining TRUE and NULL values in OR operations.

  • Fixed the InvertedIndexResultBitmap::operator|=() method to properly implement SQL three-valued logic for OR operations
  • Added comprehensive unit tests to verify correct NULL handling in various OR scenarios
  • Added regression tests to ensure MATCH OR queries with NULL values return the correct results
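
The three-valued OR described above can be sketched at the bitmap level as follows. This is an illustration under stated assumptions, not the actual `operator|=()` from the PR: it assumes each operand carries one set of TRUE rows and one set of NULL rows, uses `std::set<uint32_t>` as a stand-in for the real bitmap type, and the names `ResultBitmap`, `or_three_valued`, and `check_or_three_valued` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a (data, null) bitmap pair; not the Doris class.
struct ResultBitmap {
    std::set<uint32_t> data;   // rows that evaluate to TRUE
    std::set<uint32_t> nulls;  // rows that evaluate to NULL
};

// OR under SQL three-valued logic:
//   TRUE rows = TRUE(a) union TRUE(b)
//   NULL rows = (NULL(a) union NULL(b)) minus the merged TRUE rows
ResultBitmap or_three_valued(const ResultBitmap& a, const ResultBitmap& b) {
    ResultBitmap out;
    out.data = a.data;
    out.data.insert(b.data.begin(), b.data.end());
    for (uint32_t row : a.nulls) {
        if (out.data.count(row) == 0) out.nulls.insert(row);
    }
    for (uint32_t row : b.nulls) {
        if (out.data.count(row) == 0) out.nulls.insert(row);
    }
    return out;
}

// Small worked case; call from main() or a test.
void check_or_three_valued() {
    ResultBitmap a{{0, 1}, {2}};  // rows 0,1 TRUE; row 2 NULL
    ResultBitmap b{{3}, {1, 4}};  // row 3 TRUE; rows 1,4 NULL
    ResultBitmap r = or_three_valued(a, b);
    // Row 1 is TRUE in a and NULL in b, so it must stay TRUE only.
    assert(r.data == std::set<uint32_t>({0, 1, 3}));
    assert(r.nulls == std::set<uint32_t>({2, 4}));
}
```

Subtracting the merged TRUE rows from the union of the NULL rows is the step that keeps a TRUE OR NULL row out of the NULL set.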

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
be/src/olap/rowset/segment_v2/inverted_index_reader.h	Core fix for NULL bitmap handling in OR operations using proper three-valued logic
be/test/olap/rowset/segment_v2/inverted_index_reader_test.cpp	Unit tests covering all cases of OR operations with NULL values
regression-test/suites/inverted_index_p0/test_match_or_null_semantics.groovy	Integration tests verifying MATCH OR queries work correctly with NULL values

WHERE title MATCH_ALL 'Philosophy' OR content MATCH_ALL 'Disney+ Hotstar'
"""

assertEquals(16, test1[0][0], "MATCH should return 16 rows (15 with title match + 1 with content match)")

Copilot AI Oct 3, 2025

Fixed typo in error message: 'recieve' should be 'receive'.

Comment on lines +3640 to +3645
for (uint32_t i = 0; i < 15; ++i) {
    EXPECT_TRUE(bitmap_field1.get_data_bitmap()->contains(i))
            << "Row " << i << " should be TRUE";
    EXPECT_FALSE(bitmap_field1.get_null_bitmap()->contains(i))
            << "Row " << i << " should not be NULL";
}

Copilot AI Oct 3, 2025

The magic number 15 is used without explanation. Consider extracting it to a named constant like NUM_MATCHING_ROWS or adding a comment explaining why 15 rows are expected.

@doris-robot

ClickBench: Total hot run time: 30.06 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9d5677a0b31206a2100d2375e52e69274030fb6c, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.06	0.05
query2	0.09	0.06	0.06
query3	0.25	0.09	0.09
query4	1.61	0.12	0.11
query5	0.28	0.27	0.25
query6	1.18	0.66	0.65
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.61	0.53	0.51
query10	0.58	0.58	0.61
query11	0.16	0.11	0.11
query12	0.16	0.12	0.12
query13	0.63	0.63	0.63
query14	1.05	1.02	1.03
query15	0.88	0.88	0.87
query16	0.40	0.41	0.40
query17	1.07	1.04	1.10
query18	0.22	0.21	0.21
query19	1.95	1.90	1.82
query20	0.01	0.01	0.02
query21	15.44	0.96	0.59
query22	0.77	1.06	0.64
query23	15.04	1.40	0.65
query24	7.65	1.38	0.35
query25	0.28	0.09	0.08
query26	0.65	0.18	0.13
query27	0.07	0.06	0.05
query28	9.40	1.40	0.96
query29	12.55	3.98	3.26
query30	0.28	0.13	0.12
query31	2.82	0.59	0.39
query32	3.25	0.57	0.49
query33	3.02	3.06	3.10
query34	16.19	5.47	4.88
query35	4.93	4.93	4.98
query36	0.68	0.52	0.50
query37	0.10	0.07	0.07
query38	0.07	0.05	0.05
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.04	0.03
query42	0.04	0.04	0.03
query43	0.05	0.04	0.03
Total cold run time: 104.86 s
Total hot run time: 30.06 s

@doris-robot

BE UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.49% (17708/33739)
Line Coverage 37.66% (160748/426868)
Region Coverage 32.14% (122757/381898)
Branch Coverage 33.55% (53851/160525)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.12% (23509/33057)
Line Coverage 57.57% (245498/426437)
Region Coverage 52.83% (204316/386726)
Branch Coverage 54.51% (87952/161336)

Member

@eldenmoon eldenmoon left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 3, 2025
@github-actions
Contributor

github-actions bot commented Oct 3, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Oct 3, 2025

PR approved by anyone and no changes requested.

Contributor

@zhiqiang-hhhh zhiqiang-hhhh left a comment

lgtm

@airborne12 airborne12 merged commit 34a27a3 into apache:master Oct 4, 2025
32 of 35 checks passed
@airborne12 airborne12 deleted the fix-match branch October 4, 2025 04:30
github-actions bot pushed a commit that referenced this pull request Oct 4, 2025
…56699)

yiguolei pushed a commit that referenced this pull request Oct 4, 2025
…R queries #56699 (#56702)

Cherry-picked from #56699

Co-authored-by: Jack <jiangkai@selectdb.com>
dwdwqfwe pushed a commit to dwdwqfwe/doris that referenced this pull request Oct 4, 2025
…pache#56699)

airborne12 added a commit to airborne12/apache-doris that referenced this pull request Jan 7, 2026
…pache#56699)


Labels

approved Indicates a PR has been approved by one committer. dev/4.0.0-merged reviewed

6 participants