[feature](inverted index) introduce search function for inverted index #56139

airborne12 · 2025-09-17T06:43:35Z

What problem does this PR solve?

Issue Number: close #56682

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

hello-stephen · 2025-09-17T06:43:40Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

airborne12 · 2025-09-17T06:46:41Z

run buildall

hello-stephen · 2025-09-17T07:03:43Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	84.02% (1541/1834)
Line Coverage	68.05% (27596/40553)
Region Coverage	68.44% (13573/19832)
Branch Coverage	58.71% (7249/12348)

doris-robot · 2025-09-17T08:09:18Z

TPC-H: Total hot run time: 35579 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 689a9ddcc64de71dfeca0a6ec59a746bd72fb269, data reload: false

------ Round 1 ----------------------------------
q1	17601	5229	5109	5109
q2	2025	332	218	218
q3	10244	1329	731	731
q4	10221	1040	534	534
q5	7534	2481	2369	2369
q6	186	173	140	140
q7	934	775	668	668
q8	9349	1323	1146	1146
q9	7007	5148	5129	5129
q10	6912	2441	1976	1976
q11	473	305	287	287
q12	362	374	229	229
q13	17771	3645	3037	3037
q14	234	240	214	214
q15	549	506	493	493
q16	1006	1001	950	950
q17	616	876	370	370
q18	7600	7149	7115	7115
q19	905	964	576	576
q20	340	349	245	245
q21	3626	3192	3045	3045
q22	1106	1065	998	998
Total cold run time: 106601 ms
Total hot run time: 35579 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5134	5128	5130	5128
q2	246	340	231	231
q3	2252	2664	2348	2348
q4	1368	1760	1362	1362
q5	4213	4606	4637	4606
q6	225	178	135	135
q7	2024	2043	1817	1817
q8	2652	2636	2567	2567
q9	7473	7386	7494	7386
q10	3062	3318	2911	2911
q11	617	539	537	537
q12	696	818	681	681
q13	3503	3914	3334	3334
q14	302	297	289	289
q15	551	491	492	491
q16	1076	1118	1061	1061
q17	1241	1592	1336	1336
q18	7848	7638	7670	7638
q19	841	826	844	826
q20	2054	2044	2006	2006
q21	4781	4347	4274	4274
q22	1098	1025	983	983
Total cold run time: 53257 ms
Total hot run time: 51947 ms

doris-robot · 2025-09-17T08:21:03Z

TPC-DS: Total hot run time: 188423 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 689a9ddcc64de71dfeca0a6ec59a746bd72fb269, data reload: false

query1	1087	417	412	412
query2	6567	1700	1684	1684
query3	6763	225	225	225
query4	26560	23669	23233	23233
query5	4925	636	498	498
query6	333	245	241	241
query7	4670	518	296	296
query8	321	258	240	240
query9	8649	2612	2629	2612
query10	528	338	274	274
query11	15389	14994	14780	14780
query12	172	120	112	112
query13	1663	552	434	434
query14	10802	9197	9177	9177
query15	209	194	173	173
query16	7761	693	506	506
query17	1172	732	612	612
query18	2046	432	331	331
query19	221	202	172	172
query20	133	129	128	128
query21	215	149	119	119
query22	4066	4293	4257	4257
query23	33778	33044	33038	33038
query24	8520	2465	2437	2437
query25	605	546	457	457
query26	1240	283	166	166
query27	2730	511	368	368
query28	4373	2260	2235	2235
query29	801	618	505	505
query30	296	224	206	206
query31	935	824	753	753
query32	87	77	73	73
query33	650	375	329	329
query34	812	878	531	531
query35	821	830	746	746
query36	988	1014	929	929
query37	122	108	86	86
query38	3510	3508	3516	3508
query39	1526	1417	1426	1417
query40	219	132	128	128
query41	71	113	61	61
query42	131	117	112	112
query43	517	514	471	471
query44	1391	849	859	849
query45	185	176	173	173
query46	895	1038	665	665
query47	1784	1815	1762	1762
query48	398	424	325	325
query49	777	526	406	406
query50	683	679	401	401
query51	3880	3959	3886	3886
query52	115	113	105	105
query53	249	277	211	211
query54	601	593	523	523
query55	95	86	83	83
query56	307	327	296	296
query57	1241	1190	1117	1117
query58	284	268	269	268
query59	2547	2724	2553	2553
query60	351	349	342	342
query61	162	158	188	158
query62	824	756	672	672
query63	234	194	194	194
query64	4411	1109	800	800
query65	4053	4027	3992	3992
query66	1084	439	335	335
query67	15587	15143	14919	14919
query68	9427	946	591	591
query69	491	313	266	266
query70	1443	1249	1323	1249
query71	574	348	316	316
query72	6013	4978	5033	4978
query73	714	612	375	375
query74	8929	9153	8762	8762
query75	4458	3319	2835	2835
query76	4907	1184	742	742
query77	1012	405	327	327
query78	9541	9812	8819	8819
query79	1389	814	601	601
query80	694	559	501	501
query81	486	264	226	226
query82	240	160	137	137
query83	297	261	256	256
query84	306	115	92	92
query85	847	475	495	475
query86	340	313	311	311
query87	3793	3681	3612	3612
query88	2887	2248	2236	2236
query89	411	337	291	291
query90	2080	222	214	214
query91	166	168	133	133
query92	83	70	63	63
query93	1224	1113	642	642
query94	690	434	322	322
query95	399	315	299	299
query96	492	569	274	274
query97	2918	2976	2874	2874
query98	239	206	216	206
query99	1426	1413	1291	1291
Total cold run time: 278038 ms
Total hot run time: 188423 ms

doris-robot · 2025-09-17T08:26:33Z

ClickBench: Total hot run time: 29.6 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 689a9ddcc64de71dfeca0a6ec59a746bd72fb269, data reload: false

query1	0.05	0.05	0.06
query2	0.09	0.05	0.06
query3	0.25	0.08	0.08
query4	1.60	0.12	0.12
query5	0.28	0.27	0.25
query6	1.16	0.65	0.66
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.64	0.53	0.52
query10	0.58	0.59	0.60
query11	0.17	0.12	0.11
query12	0.15	0.11	0.12
query13	0.62	0.63	0.61
query14	1.02	1.02	1.04
query15	0.88	0.86	0.85
query16	0.41	0.42	0.40
query17	1.07	1.06	1.06
query18	0.21	0.20	0.20
query19	2.01	1.86	1.83
query20	0.02	0.01	0.01
query21	15.46	0.96	0.58
query22	0.77	1.18	0.66
query23	14.90	1.39	0.67
query24	6.77	1.88	0.44
query25	0.50	0.19	0.07
query26	0.61	0.18	0.14
query27	0.06	0.06	0.06
query28	9.03	0.93	0.45
query29	12.57	3.96	3.22
query30	0.29	0.14	0.11
query31	2.83	0.62	0.39
query32	3.25	0.57	0.48
query33	3.09	3.11	3.14
query34	16.14	5.50	4.84
query35	4.94	4.94	4.91
query36	0.71	0.52	0.51
query37	0.11	0.08	0.08
query38	0.07	0.06	0.04
query39	0.04	0.02	0.03
query40	0.18	0.16	0.15
query41	0.10	0.03	0.04
query42	0.04	0.03	0.03
query43	0.05	0.03	0.03
Total cold run time: 103.81 s
Total hot run time: 29.6 s

airborne12 · 2025-09-17T08:44:14Z

run buildall

Copilot

Pull Request Overview

This PR introduces a search function for the inverted index, providing a DSL-based interface for full-text search. The implementation covers the full path from FE parsing through BE execution.

Key changes:

- Introduces an ANTLR-based DSL parser for structured search queries with support for various clause types (TERM, PHRASE, PREFIX, WILDCARD, etc.)
- Adds new expression types and rewrite rules to handle search function translation from scalar function to slot-based expressions
- Implements BE search evaluation with inverted index integration supporting compound boolean queries
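
As a rough illustration of the compound boolean queries mentioned above, the sketch below combines per-clause row sets with AND, OR, and NOT. It is written in Java for consistency with the FE snippet quoted later in this thread; the BE code is C++ and its actual data structures are not shown here. `BooleanQuerySketch` and its methods are hypothetical names, and `java.util.BitSet` is only a stand-in for whatever bitmap type the real implementation uses.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch: combine per-clause row bitmaps with AND / OR / NOT. */
final class BooleanQuerySketch {
    /** Rows that match every clause (all rows when the list is empty). */
    static BitSet and(List<BitSet> clauses, int rowCount) {
        BitSet result = new BitSet(rowCount);
        result.set(0, rowCount); // start from "all rows" and narrow down
        for (BitSet clause : clauses) {
            result.and(clause);
        }
        return result;
    }

    /** Rows that match at least one clause. */
    static BitSet or(List<BitSet> clauses) {
        BitSet result = new BitSet();
        for (BitSet clause : clauses) {
            result.or(clause);
        }
        return result;
    }

    /** Rows in [0, rowCount) that do not match the clause. */
    static BitSet not(BitSet clause, int rowCount) {
        BitSet result = (BitSet) clause.clone(); // leave the input untouched
        result.flip(0, rowCount);
        return result;
    }
}
```

Each method returns a fresh `BitSet`, so intermediate results for nested clauses can be combined without mutating the per-term inputs.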

Reviewed Changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
gensrc/thrift/Exprs.thrift	Adds thrift structures for search DSL communication between FE and BE
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/search/	ANTLR grammar files for parsing search DSL syntax
fe/fe-core/src/main/java/.../SearchDslParser.java	DSL parser implementation with AST building and field binding extraction
fe/fe-core/src/main/java/.../Search.java	Scalar function implementation for search DSL
fe/fe-core/src/main/java/.../SearchExpression.java	Expression type for translated search with slot references
fe/fe-core/src/main/java/.../RewriteSearchToSlots.java	Rewrite rule to convert search functions to slot-based expressions
fe/fe-core/src/main/java/.../SearchPredicate.java	Analysis layer predicate for FE-to-BE translation
be/src/vec/functions/function_search.*	BE function implementation with inverted index evaluation
be/src/vec/exprs/vsearch.*	BE expression evaluation for search operations

Copilot · 2025-09-17T08:46:48Z

...c/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/SearchDslParser.java

+        private String getCurrentFieldName() {
+            // This is a simplified approach - in a real implementation,
+            // we'd need to track context properly
+            return fieldNames.isEmpty() ? "_all" : fieldNames.iterator().next();
+        }


The getCurrentFieldName method returns an arbitrary field name from a Set, which has no guaranteed iteration order. This can lead to inconsistent field name resolution. Consider maintaining a proper context stack or another field name resolution mechanism.
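
One way to read the "context stack" suggestion is sketched below. This is a minimal illustration, not the code that was merged: `FieldContext`, `enter`, `exit`, and `current` are hypothetical names, and the `"_all"` default is simply copied from the snippet above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: tracks which field the parser is currently inside. */
final class FieldContext {
    private static final String DEFAULT_FIELD = "_all";
    private final Deque<String> stack = new ArrayDeque<>();

    /** Call when the visitor enters a field-scoped clause. */
    void enter(String fieldName) {
        stack.push(fieldName);
    }

    /** Call when the visitor leaves that clause. */
    void exit() {
        stack.pop();
    }

    /** Innermost enclosing field, or the default when no field is active. */
    String current() {
        return stack.isEmpty() ? DEFAULT_FIELD : stack.peek();
    }
}
```

Because the stack is ordered, `current()` is deterministic for nested clauses, which addresses the ordering concern raised about iterating over a Set.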

Copilot · 2025-09-17T08:46:48Z

be/src/vec/functions/function_search.cpp

+
+            // Open directory directly
+            auto directory = DORIS_TRY(
+                    index_file_reader->open(&inverted_reader->get_index_meta(), context->io_ctx));


The DORIS_TRY macro will return from the function on error, but the subsequent code assumes directory is always valid. This could lead to accessing a null pointer if the macro doesn't behave as expected.

Suggested change

-                    index_file_reader->open(&inverted_reader->get_index_meta(), context->io_ctx));
+                    index_file_reader->open(&inverted_reader->get_index_meta(), context->io_ctx));
+            if (!directory) {
+                LOG(WARNING) << "search: Failed to open directory for field: " << field_name;
+                continue;
+            }

Copilot · 2025-09-17T08:46:49Z

be/src/olap/rowset/segment_v2/inverted_index_iterator.cpp

+    DBUG_EXECUTE_IF("inverted_index_reader.select_best_reader", {
        auto type = DebugPoints::instance()->get_debug_param_or_default<int32_t>(
-                "inverted_index_reader._select_best_reader", "type", -1);
+                "inverted_index_reader.select_best_reader", "type", -1);


The debug point name has been updated from _select_best_reader to select_best_reader but the string literal still uses the old name with underscore prefix. This inconsistency could cause debug functionality to not work properly.

doris-robot · 2025-09-17T09:09:08Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	84.02% (1541/1834)
Line Coverage	68.04% (27593/40553)
Region Coverage	68.46% (13577/19832)
Branch Coverage	58.71% (7250/12348)

doris-robot · 2025-09-17T09:26:52Z

TPC-H: Total hot run time: 34698 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 97e23d0e012a4c27b385fff80dc0871f9be8b6e4, data reload: false

------ Round 1 ----------------------------------
q1	16471	5154	5030	5030
q2	2010	322	213	213
q3	9749	1306	726	726
q4	9772	1024	524	524
q5	7464	2304	2468	2304
q6	191	176	140	140
q7	918	801	637	637
q8	9154	1349	1123	1123
q9	7087	5157	5145	5145
q10	6926	2386	1986	1986
q11	497	306	297	297
q12	355	374	239	239
q13	17590	3702	3037	3037
q14	245	249	226	226
q15	552	499	495	495
q16	1019	1012	962	962
q17	591	872	377	377
q18	7496	7212	7035	7035
q19	1191	946	577	577
q20	343	355	249	249
q21	3807	2582	2376	2376
q22	1071	1053	1000	1000
Total cold run time: 104499 ms
Total hot run time: 34698 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5513	5115	5081	5081
q2	253	338	230	230
q3	2212	2671	2310	2310
q4	1347	1779	1337	1337
q5	4186	4254	4617	4254
q6	213	175	135	135
q7	2091	1977	1862	1862
q8	2652	2529	2519	2519
q9	7481	7379	7352	7352
q10	3077	3392	2900	2900
q11	584	519	541	519
q12	701	788	651	651
q13	3507	4019	3300	3300
q14	295	321	296	296
q15	525	495	496	495
q16	1095	1135	1089	1089
q17	1196	1544	1398	1398
q18	7895	7684	7516	7516
q19	824	869	951	869
q20	2039	2096	1913	1913
q21	5144	4385	4408	4385
q22	1069	1048	1022	1022
Total cold run time: 53899 ms
Total hot run time: 51433 ms

doris-robot · 2025-09-17T09:38:48Z

TPC-DS: Total hot run time: 188419 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 97e23d0e012a4c27b385fff80dc0871f9be8b6e4, data reload: false

query1	1098	444	418	418
query2	6560	1727	1647	1647
query3	6756	219	226	219
query4	26602	23773	22960	22960
query5	4401	646	459	459
query6	331	230	235	230
query7	4654	514	293	293
query8	297	255	232	232
query9	8688	2631	2660	2631
query10	486	343	292	292
query11	15716	15009	14719	14719
query12	170	115	117	115
query13	1672	574	421	421
query14	10628	9212	9283	9212
query15	209	198	185	185
query16	7544	662	536	536
query17	1269	782	662	662
query18	2058	435	347	347
query19	210	203	189	189
query20	131	139	136	136
query21	222	136	118	118
query22	4224	4312	4162	4162
query23	33839	33163	32907	32907
query24	8247	2468	2454	2454
query25	624	555	521	521
query26	1233	280	161	161
query27	2733	511	367	367
query28	4196	2257	2231	2231
query29	805	606	479	479
query30	299	228	205	205
query31	970	811	738	738
query32	92	76	72	72
query33	579	379	339	339
query34	794	869	522	522
query35	810	854	752	752
query36	979	1002	930	930
query37	121	107	82	82
query38	3495	3559	3560	3559
query39	1501	1416	1439	1416
query40	216	126	119	119
query41	69	63	65	63
query42	125	114	112	112
query43	517	520	476	476
query44	1352	872	871	871
query45	184	176	171	171
query46	872	1023	649	649
query47	1792	1832	1729	1729
query48	399	421	310	310
query49	770	519	397	397
query50	669	706	413	413
query51	3885	3978	3877	3877
query52	120	112	106	106
query53	245	272	212	212
query54	592	582	516	516
query55	93	83	85	83
query56	340	318	305	305
query57	1237	1206	1126	1126
query58	283	272	261	261
query59	2551	2690	2590	2590
query60	345	380	334	334
query61	163	159	161	159
query62	847	724	686	686
query63	237	201	191	191
query64	4238	1159	843	843
query65	4049	4008	3960	3960
query66	1098	442	344	344
query67	15550	15268	14984	14984
query68	9041	948	582	582
query69	499	315	292	292
query70	1345	1314	1279	1279
query71	583	363	325	325
query72	6101	5027	5231	5027
query73	771	644	367	367
query74	8919	9206	8700	8700
query75	4403	3373	2860	2860
query76	3795	1186	777	777
query77	897	407	326	326
query78	9625	9713	8908	8908
query79	2404	816	615	615
query80	660	576	505	505
query81	509	264	232	232
query82	387	155	138	138
query83	302	272	255	255
query84	310	115	99	99
query85	884	476	448	448
query86	382	311	289	289
query87	3742	3715	3596	3596
query88	3517	2278	2255	2255
query89	417	334	308	308
query90	1843	229	230	229
query91	167	175	142	142
query92	82	73	64	64
query93	1986	1027	639	639
query94	683	439	350	350
query95	454	320	314	314
query96	498	581	286	286
query97	2969	3015	2895	2895
query98	241	220	209	209
query99	1460	1454	1303	1303
Total cold run time: 277957 ms
Total hot run time: 188419 ms

doris-robot · 2025-09-17T09:44:19Z

ClickBench: Total hot run time: 29.98 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 97e23d0e012a4c27b385fff80dc0871f9be8b6e4, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.06	0.05
query3	0.25	0.09	0.09
query4	1.60	0.12	0.12
query5	0.30	0.26	0.25
query6	1.16	0.66	0.66
query7	0.03	0.03	0.02
query8	0.06	0.04	0.04
query9	0.61	0.53	0.52
query10	0.58	0.56	0.56
query11	0.16	0.12	0.11
query12	0.18	0.12	0.12
query13	0.63	0.62	0.63
query14	1.01	1.04	1.03
query15	0.88	0.87	0.86
query16	0.41	0.40	0.40
query17	1.05	1.06	1.05
query18	0.22	0.20	0.20
query19	1.95	1.88	1.88
query20	0.01	0.01	0.01
query21	15.47	0.98	0.58
query22	0.78	1.14	0.68
query23	15.00	1.39	0.70
query24	7.44	1.51	0.60
query25	0.50	0.13	0.13
query26	0.61	0.16	0.15
query27	0.07	0.05	0.06
query28	9.64	0.90	0.47
query29	12.56	3.93	3.28
query30	0.28	0.14	0.14
query31	2.84	0.59	0.40
query32	3.25	0.59	0.48
query33	3.08	3.04	3.17
query34	16.11	5.48	4.84
query35	4.94	4.92	4.91
query36	0.70	0.51	0.50
query37	0.11	0.08	0.07
query38	0.07	0.04	0.05
query39	0.04	0.03	0.03
query40	0.16	0.15	0.16
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 105.07 s
Total hot run time: 29.98 s

hello-stephen · 2025-09-17T10:06:32Z

BE UT Coverage Report

Increment line coverage 2.16% (10/464) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	52.29% (17550/33563)
Line Coverage	37.42% (159048/425039)
Region Coverage	31.99% (121407/379516)
Branch Coverage	33.34% (53202/159589)

zzzxl1993 · 2025-09-18T06:33:54Z

run buildall

hello-stephen · 2025-09-18T07:02:02Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	84.02% (1541/1834)
Line Coverage	68.01% (27579/40553)
Region Coverage	68.43% (13573/19835)
Branch Coverage	58.64% (7242/12350)

airborne12 · 2025-09-21T16:15:11Z

run buildall

hello-stephen · 2025-09-21T16:45:34Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	84.02% (1541/1834)
Line Coverage	68.00% (27616/40610)
Region Coverage	68.37% (13583/19868)
Branch Coverage	58.67% (7253/12362)

hello-stephen · 2025-09-21T21:00:33Z

FE Regression Coverage Report

Increment line coverage 44.42% (179/403) 🎉
Increment coverage report
Complete coverage report

airborne12 · 2025-09-23T13:59:04Z

run buildall

hello-stephen · 2025-09-23T15:05:05Z

FE UT Coverage Report

Increment line coverage 66.75% (281/421) 🎉
Increment coverage report
Complete coverage report

doris-robot · 2025-09-23T15:55:30Z

ClickBench: Total hot run time: 30.13 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1124b83a40b68848e4b44a30717ad530fdbaa3c7, data reload: false

query1	0.05	0.04	0.05
query2	0.10	0.06	0.05
query3	0.25	0.08	0.08
query4	1.60	0.12	0.12
query5	0.29	0.27	0.26
query6	1.18	0.66	0.64
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.63	0.53	0.53
query10	0.61	0.58	0.57
query11	0.16	0.11	0.12
query12	0.15	0.11	0.12
query13	0.64	0.63	0.62
query14	1.05	1.02	1.03
query15	0.89	0.86	0.86
query16	0.41	0.40	0.40
query17	1.03	1.03	1.04
query18	0.22	0.20	0.21
query19	2.28	1.98	2.03
query20	0.02	0.01	0.02
query21	15.65	0.94	0.58
query22	0.76	1.18	0.67
query23	14.92	1.32	0.65
query24	7.64	0.99	0.32
query25	0.36	0.23	0.13
query26	0.66	0.16	0.14
query27	0.07	0.06	0.05
query28	9.68	1.35	0.94
query29	12.63	3.87	3.26
query30	0.28	0.13	0.13
query31	2.83	0.62	0.39
query32	3.25	0.57	0.47
query33	3.03	3.13	3.08
query34	16.07	5.47	4.81
query35	4.95	4.93	4.93
query36	0.71	0.51	0.52
query37	0.10	0.07	0.08
query38	0.06	0.05	0.04
query39	0.03	0.03	0.04
query40	0.18	0.17	0.14
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.03	0.03
Total cold run time: 105.7 s
Total hot run time: 30.13 s

hello-stephen · 2025-09-23T18:24:55Z

FE Regression Coverage Report

Increment line coverage 66.75% (281/421) 🎉
Increment coverage report
Complete coverage report

doris-robot · 2025-09-24T01:30:52Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	83.80% (1541/1839)
Line Coverage	67.84% (27624/40719)
Region Coverage	68.20% (13591/19929)
Branch Coverage	58.47% (7252/12402)

airborne12 · 2025-09-24T09:55:58Z

run buildall

doris-robot · 2025-09-24T10:14:23Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	83.75% (1567/1871)
Line Coverage	67.90% (27903/41096)
Region Coverage	68.22% (13723/20117)
Branch Coverage	58.50% (7320/12512)

hello-stephen · 2025-09-24T10:56:07Z

FE UT Coverage Report

Increment line coverage 66.75% (281/421) 🎉
Increment coverage report
Complete coverage report

github-actions · 2025-09-30T22:11:53Z

PR approved by at least one committer and no changes requested.

…56699) Related PR: #56139 Problem Summary: This PR fixes a bug in NULL bitmap handling for MATCH OR queries in inverted index query. The bug was causing incorrect boolean logic evaluation when combining TRUE and NULL values in OR operations.
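
For context on the TRUE/NULL case mentioned above: under SQL three-valued logic, `TRUE OR NULL` is `TRUE`, while `FALSE OR NULL` is `NULL`. The sketch below shows one way to express that with a pair of row bitmaps per operand. It illustrates the semantics only and is not the fix from #56699; `NullableMatch` is a hypothetical type, and it assumes each operand's TRUE and NULL row sets are disjoint.

```java
import java.util.BitSet;

/** Hypothetical pair of bitmaps: rows that matched, and rows whose result is NULL. */
final class NullableMatch {
    final BitSet trueRows;
    final BitSet nullRows;

    NullableMatch(BitSet trueRows, BitSet nullRows) {
        // Defensive copies so callers cannot mutate the result afterwards.
        this.trueRows = (BitSet) trueRows.clone();
        this.nullRows = (BitSet) nullRows.clone();
    }

    /** Three-valued OR: a row is TRUE if either side is TRUE, else NULL if either side is NULL. */
    NullableMatch or(NullableMatch other) {
        BitSet t = (BitSet) trueRows.clone();
        t.or(other.trueRows);
        BitSet n = (BitSet) nullRows.clone();
        n.or(other.nullRows);
        n.andNot(t); // TRUE OR NULL is TRUE, so rows already TRUE are not NULL
        return new NullableMatch(t, n);
    }
}
```

Rows that appear in neither bitmap of the result are FALSE.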

### What problem does this PR solve? Issue Number: close #xxx Related PR: #56139 Problem Summary: This PR adds restrictions for the search() function to ensure it can only be used in WHERE clauses on single-table OLAP scans. The implementation includes validation rules that reject search() usage in other contexts like SELECT projections, GROUP BY clauses, HAVING clauses, and multi-table scenarios.

### What problem does this PR solve? Issue Number: close #xxx Related PR: #56139 Problem Summary: This PR adds EXACT DSL functionality to the search function, enabling exact string matching without tokenization. This feature complements existing ANY/ALL operators that work with tokenized indexes by providing strict string equality matching.

…56718)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #56139

Problem Summary: This PR adds support for variant subcolumn access in search functions, enabling search queries to target specific JSON paths within variant columns using dot notation (e.g., field.subcolumn). The feature extends the search DSL to handle variant data types with subcolumn paths, allowing more granular search capabilities on semi-structured data.

```
SELECT * FROM test_variant_search_subcolumn WHERE search('variantColumn.subcolumn:textMatched');
```
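
To make the dot notation concrete, the sketch below splits a single `field.subcolumn:term` clause into its parts. It is only an illustration under simplifying assumptions: the actual parser in this PR is ANTLR-based, `FieldPath` is a hypothetical name, and quoting, escaping, and boolean operators are ignored.

```java
/** Hypothetical holder for one parsed `column.subpath:term` clause. */
final class FieldPath {
    final String column;        // top-level column name
    final String subcolumnPath; // path after the first dot, empty if none
    final String term;          // text after the first colon

    private FieldPath(String column, String subcolumnPath, String term) {
        this.column = column;
        this.subcolumnPath = subcolumnPath;
        this.term = term;
    }

    /** Splits on the first ':' and then on the first '.' of the field part. */
    static FieldPath parse(String clause) {
        int colon = clause.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("expected field:term, got: " + clause);
        }
        String field = clause.substring(0, colon);
        String term = clause.substring(colon + 1);
        int dot = field.indexOf('.');
        String column = dot < 0 ? field : field.substring(0, dot);
        String subPath = dot < 0 ? "" : field.substring(dot + 1);
        return new FieldPath(column, subPath, term);
    }
}
```

For the query in the commit message, `FieldPath.parse("variantColumn.subcolumn:textMatched")` yields column `variantColumn`, subcolumn path `subcolumn`, and term `textMatched`.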

airborne12 requested a review from Copilot September 17, 2025 08:44

Copilot AI reviewed Sep 17, 2025

airborne12 force-pushed the fix-case branch from 3ec1b4a to f5af2dc Compare September 21, 2025 16:15

airborne12 force-pushed the fix-case branch from cefc7dc to 1124b83 Compare September 23, 2025 13:56

airborne12 force-pushed the fix-case branch from 1124b83 to 9bf6960 Compare September 24, 2025 09:52

yiguolei approved these changes Sep 30, 2025

github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 30, 2025

yiguolei merged commit 9584ce3 into apache:master Sep 30, 2025
25 of 26 checks passed

github-actions bot pushed a commit that referenced this pull request Sep 30, 2025

[feature](inverted index) introduce search function for inverted index (

2156567

#56139)

github-actions bot mentioned this pull request Sep 30, 2025

branch-4.0: [feature](inverted index) introduce search function for inverted index #56139 #56693

Merged

airborne12 deleted the fix-case branch September 30, 2025 23:27

yiguolei pushed a commit that referenced this pull request Oct 1, 2025

branch-4.0: [feature](inverted index) introduce search function for i…

6620e62

…nverted index #56139 (#56693) Cherry-picked from #56139 Co-authored-by: Jack <jiangkai@selectdb.com>

yiguolei added dev/4.0.0-merged and removed dev/4.0.x labels Oct 1, 2025

airborne12 mentioned this pull request Oct 4, 2025

[fix](inverted index) Fix NULL bitmap handling in MATCH OR queries #56699

Merged

16 tasks

dwdwqfwe pushed a commit to dwdwqfwe/doris that referenced this pull request Oct 4, 2025

[feature](inverted index) introduce search function for inverted index (

b8e2f9f

apache#56139)

airborne12 mentioned this pull request Oct 5, 2025

[fix](search) add restriction for search function #56706

Merged

16 tasks

airborne12 mentioned this pull request Oct 5, 2025

[feature](search) add exact dsl for search function #56710

Merged

16 tasks

airborne12 mentioned this pull request Oct 14, 2025

[fix](inverted index) fix is null predicate for inverted index evaluate #56964

Merged

16 tasks

airborne12 mentioned this pull request Oct 16, 2025

[feature](search) add variant subcolumn suppport for search function #56718

Merged

16 tasks

airborne12 added a commit to airborne12/apache-doris that referenced this pull request Jan 7, 2026

[feature](inverted index) introduce search function for inverted index (

2a296cb

apache#56139)
