Skip to content

Fix copy_to functionality for geo_point fields with object/array values#20542

Merged
andrross merged 13 commits intoopensearch-project:mainfrom
krocky-cooky:fix/geopoint-copyto-20540
Mar 13, 2026
Merged

Fix copy_to functionality for geo_point fields with object/array values#20542
andrross merged 13 commits intoopensearch-project:mainfrom
krocky-cooky:fix/geopoint-copyto-20540

Conversation

@krocky-cooky
Copy link
Copy Markdown
Contributor

Description

This PR adds support for the copy_to attribute on geo_point fields when they are represented as complex structures (objects or arrays).

Problem

Previously, when a geo_point field with a copy_to attribute was indexed using object notation (e.g., {"lat": 40.71, "lon": 74.00}) or array notation (e.g., [74.00, 40.71]), the copy operation would fail. This occurred because the XContent parser consumed all tokens during the initial field parsing, leaving no data available for the subsequent copy_to operation. This is a similar issue to k-NN Plugin issue #2536 and is caused by values being defined across multiple tokens.

Solution

This PR introduces a mechanism to preserve the field value for reuse:

  1. Serialize complex values to bytes: When a field value is an object or array, the parser state is serialized to a byte array before parsing begins.
  2. Reparse for copy_to fields: The byte array is used to create fresh parser instances for each copy_to target field, allowing the same data to be parsed multiple times.

Changes

  • Added parseFieldWithCopyTo() method to handle field parsing with copy_to support for both simple and complex value types
  • Added parseChildToBytes() helper method to serialize the current parser state to a byte array
  • Extended parseCopyFields() to accept an optional byte array parameter for reparsing complex structures
  • Added test cases for geo_point fields with copy_to in both object and array formats

Testing

  • Added testGeoPointWithCopyTo() to verify object notation ({"lat": ..., "lon": ...})
  • Added testGeoPointArrayWithCopyTo() to verify array notation ([lon, lat])
  • All existing tests continue to pass (124 tests total in DocumentParserTests)

Related Issues

Resolves #20540

Check List

  • Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions Bot added bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing labels Feb 4, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 4, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR enables the copy_to mechanism to work with multi-token field values (e.g., geo_point with object format containing separate lat/lon tokens). It introduces internal routing that serializes complex field structures to bytes, re-parses them, and applies copy_to semantics. A new switchParser method in ParseContext allows temporary parser replacement during this process.

Changes

Cohort / File(s) Summary
Core copy_to routing logic
server/src/main/java/org/opensearch/index/mapper/DocumentParser.java
Introduces parseFieldWithCopyTo() and supporting methods (parseCopyFields with byte-handling, parseChildToBytes) to serialize and re-parse multi-token field values before applying copy_to. Replaces direct FieldMapper.parse() calls with new pathway for fields requiring copy semantics.
ParseContext enhancement
server/src/main/java/org/opensearch/index/mapper/ParseContext.java
Adds public switchParser(XContentParser parser) method that returns a filtered ParseContext with an overridden parser, enabling temporary parser replacement without affecting other context behavior.
Test coverage
server/src/test/java/org/opensearch/index/mapper/DocumentParserTests.java
Adds testGeoPointWithCopyTo() and testGeoPointArrayWithCopyTo() test methods to verify copy_to correctly propagates geo_point field data (object and array formats) to target fields.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix copy_to functionality for geo_point fields with object/array values' clearly summarizes the main change: enabling copy_to support for geo_point fields when they use object or array formats.
Description check ✅ Passed The description follows the template with all key sections completed: clear problem/solution description, list of changes, testing details, linked issue, and checklist confirmation. Non-critical sections like API changes and documentation are appropriately marked as not applicable.
Linked Issues check ✅ Passed The code changes fully address issue #20540 by implementing parseFieldWithCopyTo(), parseChildToBytes(), and updated parseCopyFields() to handle copy_to for geo_point fields with object/array values, with comprehensive tests for both formats.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the issue: core parsing logic updates in DocumentParser.java, a parser switching utility in ParseContext.java, and targeted test cases cover only the copy_to fix for complex geo_point values.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@server/src/main/java/org/opensearch/index/mapper/DocumentParser.java`:
- Around line 1572-1600: The method parseFieldWithCopyTo currently always
serializes START_OBJECT/START_ARRAY via parseChildToBytes even when there are no
copy_to targets; change it to first inspect fieldMapper.copyTo().copyToFields()
and if that collection is empty, avoid buffering and just call
fieldMapper.parse(context) (and parseCopyFields(context, ... ) without the
childBytes path); only call parseChildToBytes, create an innerParser, switch the
context and pass childBytes into parseCopyFields when copyTo targets are
present. Update references: parseFieldWithCopyTo, parseChildToBytes,
parseCopyFields, and the call to fieldMapper.copyTo().copyToFields() so
buffering happens only when needed.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 839254d and 2055b1b.

📒 Files selected for processing (3)
  • server/src/main/java/org/opensearch/index/mapper/DocumentParser.java
  • server/src/main/java/org/opensearch/index/mapper/ParseContext.java
  • server/src/test/java/org/opensearch/index/mapper/DocumentParserTests.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.

Applied to files:

  • server/src/main/java/org/opensearch/index/mapper/ParseContext.java
  • server/src/main/java/org/opensearch/index/mapper/DocumentParser.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: gradle-check
🔇 Additional comments (6)
server/src/main/java/org/opensearch/index/mapper/ParseContext.java (1)

703-711: LGTM — scoped parser override is clean and consistent with other context wrappers.

server/src/main/java/org/opensearch/index/mapper/DocumentParser.java (4)

768-773: LGTM: field parsing is consistently routed through the new copy_to helper.


848-856: LGTM: disable_objects path now honors copy_to parsing.


1343-1377: LGTM: byte-backed reparse keeps copy_to isolated and deterministic.


1381-1402: LGTM: copy_to context switching remains intact for non-byte paths.

server/src/test/java/org/opensearch/index/mapper/DocumentParserTests.java (1)

3509-3558: Great coverage for geo_point copy_to across object and array forms.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 4, 2026

❌ Gradle check result for 2055b1b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 4, 2026

❌ Gradle check result for 9616e61: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 4, 2026

❌ Gradle check result for 14a03b4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 348bb0a: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@hasnain2808
Copy link
Copy Markdown
Contributor

hasnain2808 commented Mar 4, 2026

Hi @krocky-cooky
I was trying to debug opensearch-project/k-NN#2636 when I found this PR
are you actively working on this PR/ issue?

@krocky-cooky
Copy link
Copy Markdown
Contributor Author

Thanks for the comment.
This PR doesn't include test cases for vector_field, but merging it should also resolve issue opensearch-project/k-NN#2636. It's currently awaiting review from the maintainers.
Did you have a different implementation plan in mind?

It looks like the issue was recently triaged.
@andrross Could you take a look at this when you get a chance? Happy to make any changes as needed.

Copy link
Copy Markdown
Member

@andrross andrross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you have a spotless formatting error. Can you fix that and rebase from main?

Comment thread server/src/main/java/org/opensearch/index/mapper/DocumentParser.java Outdated
This change enables copy_to functionality for geo_point fields when
they are represented as objects or arrays. Previously, copy_to would
fail for geo_point fields because the parser consumed tokens during
the initial parse, leaving nothing for the copy_to operation.

The fix serializes complex field values (objects/arrays) to bytes
before parsing, allowing the same data to be parsed multiple times
for both the original field and copy_to targets.

Changes:
- Add parseFieldWithCopyTo() method to handle field parsing with
  copy_to support for both simple and complex value types
- Add parseChildToBytes() helper to serialize current parser state
- Extend parseCopyFields() to accept byte array for reparsing
- Add test case for geo_point with copy_to attribute

Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Extract common copy-to field resolution logic from two parseCopyFields
overloads into a shared parseCopyFieldsInternal method that accepts a
CheckedBiConsumer for the copy action.

Also fix test method name typo: testGetPointArrayWithMultipleCopyTo
-> testGeoPointArrayWithMultipleCopyTo.

Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
clarify parser lifecycle and avoid reusing context with a closed parser.

Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
- use try-with-resource with XcContentBuilder
- fix brace style
- use isEmpty() method

Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
@krocky-cooky krocky-cooky force-pushed the fix/geopoint-copyto-20540 branch from fb72ba1 to 63c3ea4 Compare March 12, 2026 09:06
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 63c3ea4

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 63c3ea4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 63c3ea4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 63c3ea4: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 63c3ea4: SUCCESS

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.20%. Comparing base (db61a88) to head (63c3ea4).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20542      +/-   ##
============================================
- Coverage     73.30%   73.20%   -0.11%     
+ Complexity    72252    72152     -100     
============================================
  Files          5795     5795              
  Lines        330056   330088      +32     
  Branches      47643    47645       +2     
============================================
- Hits         241947   241632     -315     
- Misses        68695    69063     +368     
+ Partials      19414    19393      -21     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@andrross andrross merged commit 1ab15af into opensearch-project:main Mar 13, 2026
44 of 50 checks passed
aparajita31pandey pushed a commit to aparajita31pandey/OpenSearch that referenced this pull request Apr 18, 2026
…es (opensearch-project#20542)

This change enables copy_to functionality for geo_point fields when
they are represented as objects or arrays. Previously, copy_to would
fail for geo_point fields because the parser consumed tokens during
the initial parse, leaving nothing for the copy_to operation.

The fix serializes complex field values (objects/arrays) to bytes
before parsing, allowing the same data to be parsed multiple times
for both the original field and copy_to targets.

Changes:
- Add parseFieldWithCopyTo() method to handle field parsing with
  copy_to support for both simple and complex value types
- Add parseChildToBytes() helper to serialize current parser state
- Extend parseCopyFields() to accept byte array for reparsing
- Add test case for geo_point with copy_to attribute

Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Signed-off-by: Aparajita Pandey <aparajita31pandey@gmail.com>
pradeep-L pushed a commit to pradeep-L/OpenSearch that referenced this pull request Apr 21, 2026
…es (opensearch-project#20542)

This change enables copy_to functionality for geo_point fields when
they are represented as objects or arrays. Previously, copy_to would
fail for geo_point fields because the parser consumed tokens during
the initial parse, leaving nothing for the copy_to operation.

The fix serializes complex field values (objects/arrays) to bytes
before parsing, allowing the same data to be parsed multiple times
for both the original field and copy_to targets.

Changes:
- Add parseFieldWithCopyTo() method to handle field parsing with
  copy_to support for both simple and complex value types
- Add parseChildToBytes() helper to serialize current parser state
- Extend parseCopyFields() to accept byte array for reparsing
- Add test case for geo_point with copy_to attribute

Signed-off-by: krocky-cooky <kurotaku9679.sub@gmail.com>
Signed-off-by: Takuo Kuroki <kurotaku9679.sub@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] CopyTo does not work with geo_point field in a specific document.

3 participants