Add copy ingest processor#11870

Merged
msfroh merged 3 commits into opensearch-project:main from gaobinlong:copy
Jan 16, 2024
@gaobinlong
Contributor

@gaobinlong gaobinlong commented Jan 12, 2024

Description

This PR adds a new ingest processor, the copy processor, which copies a whole object from one existing field to another field. This is useful when users want to copy a nested field to the root level and then delete the original field. The set processor cannot do this because it supports copying only basic data types, not objects. The script processor can copy objects, but writing Painless scripts is not easy for users. The copy processor provides an easy way to copy an object.

The copy processor is used as follows:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "copy": {
          "source_field": "{{foo}}", 
          "target_field":"target",
          "ignore_missing":true,
          "override_target":true,
          "remove_source":true
        }
      }
    ]
  },
  "docs": [
    {
      "_version_type":"external_gte",
      "_version": 1,
      "_source": {
        "foo":"a", 
        "a": {
          "c":"1"
        },
        "target":1
      }
    }
  ]
}

The result is:

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_id": "_id",
        "_version": "1",
        "_version_type": "external_gte",
        "_source": {
          "foo": "a",
          "target": {
            "c": "1"
          }
        },
        "_ingest": {
          "timestamp": "2024-01-12T13:44:27.287123Z"
        }
      }
    }
  ]
}

Both source_field and target_field support template snippets. There are three optional parameters:

  • ignore_missing: if true, exit quietly when source_field doesn't exist or has an empty field path; defaults to false
  • override_target: if true, overwrite the value of target_field if it already exists; defaults to false
  • remove_source: if true, remove source_field after the copy operation; defaults to false
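
To make the interaction of the three parameters concrete, here is a minimal Python sketch of the behavior described above. It is not the Java implementation in this PR. The function name `copy_field`, the helpers, and the simplified `{{name}}` template rendering are hypothetical, and raising an error when the source is missing or the target already exists (with the corresponding flag left at false) is an assumption about the non-quiet path.

```python
import copy
import re

_MISSING = object()  # sentinel, so a stored None still counts as "present"


def _render(template, doc):
    # Simplified stand-in for template snippets (assumption):
    # replaces {{name}} with the value of the top-level field `name`.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: str(doc.get(m.group(1), "")), template)


def _get(doc, path):
    # Walk a dot-separated path; return _MISSING if any segment is absent.
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def _parent(doc, path, create=False):
    # Return (parent dict, leaf key) for a dot-separated path.
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {}) if create else node[key]
    return node, leaf


def copy_field(doc, source_field, target_field,
               ignore_missing=False, override_target=False,
               remove_source=False):
    source = _render(source_field, doc)
    target = _render(target_field, doc)

    value = _get(doc, source) if source else _MISSING
    if value is _MISSING:
        if ignore_missing:
            return doc  # exit quietly, document unchanged
        raise KeyError(f"source field [{source}] doesn't exist")  # assumed

    if _get(doc, target) is not _MISSING and not override_target:
        raise ValueError(f"target field [{target}] already exists")  # assumed

    node, leaf = _parent(doc, target, create=True)
    node[leaf] = copy.deepcopy(value)  # copy the whole object, not a reference

    if remove_source:
        node, leaf = _parent(doc, source)
        del node[leaf]
    return doc
```

Calling `copy_field({"foo": "a", "a": {"c": "1"}, "target": 1}, "{{foo}}", "target", ignore_missing=True, override_target=True, remove_source=True)` mirrors the simulate request above: `{{foo}}` resolves to `a`, the object under `a` replaces the old value of `target`, and `a` is removed.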

Related Issues

#10134

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
@github-actions
Contributor

github-actions bot commented Jan 12, 2024

Compatibility status:

Checks if related components are compatible with change 441fde8

Incompatible components

Incompatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer-rca.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git]

@github-actions
Contributor

❌ Gradle check result for d8edaf3: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

❌ Gradle check result for 0595d7e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
@github-actions
Contributor

❕ Gradle check result for 441fde8: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@codecov

codecov bot commented Jan 12, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (5c82ab8) 71.43% compared to head (441fde8) 71.45%.

Files | Patch % | Lines
...va/org/opensearch/ingest/common/CopyProcessor.java | 93.61% | 3 Missing ⚠️
...search/ingest/common/IngestCommonModulePlugin.java | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11870      +/-   ##
============================================
+ Coverage     71.43%   71.45%   +0.01%     
+ Complexity    59407    59376      -31     
============================================
  Files          4921     4922       +1     
  Lines        278989   279037      +48     
  Branches      40543    40550       +7     
============================================
+ Hits         199287   199374      +87     
+ Misses        63086    63045      -41     
- Partials      16616    16618       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Contributor

@msfroh msfroh left a comment

@gaobinlong -- thanks for doing this! I really like this approach.

Can you link to the document issue for the new processor?

@deshsidd
Contributor

LGTM

@gaobinlong
Contributor Author

> @gaobinlong -- thanks for doing this! I really like this approach.
>
> Can you link to the document issue for the new processor?

Thanks, I've done that, and I'll open a documentation PR for this processor.

@deshsidd
Contributor

Anything remaining here? Can we merge?

@msfroh msfroh merged commit 6d2d4dd into opensearch-project:main Jan 16, 2024
@dblock
Member

dblock commented Jan 17, 2024

Do you want this in 2.x? Backport?

@msfroh msfroh added the backport 2.x Backport to 2.x branch label Jan 17, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 17, 2024
---------

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
(cherry picked from commit 6d2d4dd)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dblock pushed a commit that referenced this pull request Jan 17, 2024
---------


(cherry picked from commit 6d2d4dd)

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@gaobinlong
Contributor Author

> Do you want this in 2.x? Backport?

Yes, I see this PR has already been backported to 2.x. Thank you all, @dblock @msfroh @deshsidd.

@reta
Contributor

reta commented Jan 22, 2024

New flaky tests #11974

peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024

---------

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024

---------

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
---------

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
@reta reta mentioned this pull request Jul 17, 2024
3 tasks