Search command revamp. #4152

Merged
penghuo merged 1 commit into opensearch-project:main from vamsimanohar:free_text_search
Sep 15, 2025

vamsimanohar commented Aug 27, 2025

Description

Note: Although the PR shows 70 changed files, most are test files and PPL query updates in existing files. The important files to review are listed below.

 Parser & Grammar

  - OpenSearchPPLParser.g4 - Grammar extensions for search expression syntax
  - AstBuilder.java - Constructs Search AST from parse tree
  - AstExpressionBuilder.java - Builds search expression tree

  Search Expression AST Classes

  - Search.java - Enhanced Search AST node with expression support
  - SearchExpression.java - Base interface for search expressions
  - SearchComparison.java - Field comparison operators (=, !=, <, >, <=, >=)
  - SearchIn.java - IN operator implementation
  - Search{And,Or,Not}.java - Boolean operators
  - SearchLiteral.java - Free text search support

  Core Integration Points

  - Analyzer.java - Integrates search expression analysis into query pipeline
  - CalciteRelNodeVisitor.java - Converts Search nodes to relational operations for Calcite execution path

Summary
The search command has been redesigned to provide well-defined full-text search functionality instead of acting as an extended where clause with source support. It now supports new search expressions that translate directly to OpenSearch's query_string DSL, so filtering runs inside OpenSearch.

The new boundary for the search command: it is designed exclusively for full-text search, always pushed down as the first operation, and limited to Lucene query syntax support. Any additional filtering should be achieved using the where command.

The revamped search command builds on Lucene query syntax (https://lucene.apache.org/core/2_9_4/queryparsersyntax.html) and supports only:

  • Field comparisons: search source=logs status=200 AND method="GET"
  • Boolean operators: AND, OR, NOT with proper precedence handling
  • IN operator: search source=logs status IN (200, 201, 204)
  • Range queries: search source=logs responseTime>100 AND responseTime<=500
  • Free text search: search source=logs "error message"

All search expressions are pushed down to OpenSearch as query_string queries, ensuring efficient data filtering at the storage layer before entering the processing pipeline.
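
As a rough illustration of the translation described above, the sketch below models a few search expressions as a small tree and renders each node as query_string text. The names (SearchQuerySketch, Expr, Comparison, In, And, Not, FreeText) are hypothetical, simplified stand-ins rather than the classes in this PR, and the exact output format is an assumption based on query_string syntax.

```java
import java.util.List;

// Hypothetical sketch: NOT the PR's actual SearchExpression classes.
final class SearchQuerySketch {
  private SearchQuerySketch() {}

  /** A node that can render itself as query_string text. */
  interface Expr {
    String toQueryString();
  }

  /** field <op> value, where op is one of =, !=, <, >, <=, >=. */
  record Comparison(String field, String op, String value) implements Expr {
    @Override
    public String toQueryString() {
      return switch (op) {
        case "=" -> field + ":" + value;
        // Assumption: != is rendered as a negated term.
        case "!=" -> "(NOT " + field + ":" + value + ")";
        // query_string-style open-ended ranges, e.g. responseTime:>100
        case ">", ">=", "<", "<=" -> field + ":" + op + value;
        default -> throw new IllegalArgumentException("Unsupported operator: " + op);
      };
    }
  }

  /** field IN (v1, v2, ...) rendered as a group of alternatives. */
  record In(String field, List<String> values) implements Expr {
    @Override
    public String toQueryString() {
      return field + ":(" + String.join(" OR ", values) + ")";
    }
  }

  /** Conjunction; always parenthesized so precedence is explicit. */
  record And(Expr left, Expr right) implements Expr {
    @Override
    public String toQueryString() {
      return "(" + left.toQueryString() + " AND " + right.toQueryString() + ")";
    }
  }

  /** Negation of a sub-expression. */
  record Not(Expr operand) implements Expr {
    @Override
    public String toQueryString() {
      return "NOT (" + operand.toQueryString() + ")";
    }
  }

  /** Free text rendered as a quoted phrase, escaping backslashes and quotes. */
  record FreeText(String text) implements Expr {
    @Override
    public String toQueryString() {
      String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
      return "\"" + escaped + "\"";
    }
  }
}
```

For example, `new SearchQuerySketch.And(new SearchQuerySketch.Comparison("status", "=", "200"), new SearchQuerySketch.Comparison("method", "=", "\"GET\""))` renders as `(status:200 AND method:"GET")`. Compound nodes are always parenthesized in this sketch so the output does not rely on any operator precedence rules.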

Solution

Key Design Decisions

  1. Separate AST Hierarchy: Introduced dedicated SearchExpression classes, distinct from the regular Expression classes, to keep search-time filtering cleanly separated from query-time evaluation.
  2. Direct Query DSL Translation: Search expressions bypass the logical/physical planning stages and are translated directly to OpenSearch Query DSL in the Analyzer phase for performance.
  3. Grammar Extension: Extended the PPL parser to support the richer search syntax while preserving full backward compatibility with existing queries.

How It Works

  • Parser constructs a Search node containing a SearchExpression tree from the query
  • Analyzer visits the Search node and transforms expressions directly into OpenSearch query_string queries
  • Queries execute at the OpenSearch level, filtering data efficiently before it enters the processing pipeline
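
The last two steps can be pictured as wrapping the generated query text in a query_string clause of a search request body. The sketch below builds that JSON by hand to stay dependency-free. The class and method names are hypothetical, and the actual plugin presumably relies on OpenSearch's query builder classes rather than string concatenation.

```java
// Hypothetical helper: wraps query text in a query_string request body.
final class QueryStringDsl {
  private QueryStringDsl() {}

  /** Escapes a string so it can be embedded in a JSON string literal. */
  static String escapeJson(String s) {
    StringBuilder sb = new StringBuilder();
    for (char c : s.toCharArray()) {
      switch (c) {
        case '"' -> sb.append("\\\"");
        case '\\' -> sb.append("\\\\");
        case '\n' -> sb.append("\\n");
        case '\r' -> sb.append("\\r");
        case '\t' -> sb.append("\\t");
        default -> {
          if (c < 0x20) {
            // Remaining control characters use the \\uXXXX form.
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
        }
      }
    }
    return sb.toString();
  }

  /** Builds a search request body whose only clause is a query_string query. */
  static String toRequestBody(String queryText) {
    return "{\"query\":{\"query_string\":{\"query\":\""
        + escapeJson(queryText)
        + "\"}}}";
  }
}
```

Calling `QueryStringDsl.toRequestBody("status:200 AND method:\"GET\"")` yields a body of the form `{"query":{"query_string":{"query":"status:200 AND method:\"GET\""}}}`, which is the shape OpenSearch expects for a query_string search.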

Related Issues

Resolves #4007

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

vamsimanohar changed the title from "[featue] Initial commitfor search command revamp" to "[feature] search command revamp." on Aug 27, 2025
vamsimanohar changed the title from "[feature] search command revamp." to "Search command revamp." on Aug 27, 2025
vamsimanohar added the enhancement (New feature or request) label on Aug 27, 2025
vamsimanohar self-assigned this on Aug 27, 2025
vamsimanohar force-pushed the free_text_search branch 5 times, most recently from f814ea5 to a3ea99a, on August 28, 2025 20:57
vamsimanohar marked this pull request as ready for review on September 2, 2025 15:42
Comment on lines +136 to +137
| SQUOTA_STRING // Single quotes searching for quotes
| BQUOTA_STRING // Backticks will also be searched.

"search source=log00001 "a'b""

{ "index": { "_id": 3 } }
{"message": "a'b"}
{ "index": { "_id": 4 } }
{"message": "'a'b"}

vamsimanohar force-pushed the free_text_search branch 5 times, most recently from a697ed2 to 4c6fd53, on September 12, 2025 18:00
penghuo previously approved these changes Sep 13, 2025
Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>
@opensearch-trigger-bot

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4152-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 bd40af5cca2c1f7ba7675e700f4c82b973ae3317
# Push it to GitHub
git push --set-upstream origin backport/backport-4152-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4152-to-2.19-dev.

Labels

backport 2.19-dev, backport-failed, backport-manually (Filed a PR to backport manually.), enhancement (New feature or request)

Development

Successfully merging this pull request may close these issues.

[FEATURE] Enhance search command to support free text and wildcards.
