Skip to content

Updating the FieldMappers to insert call addField of DocumentInput#21001

Open
darjisagar7 wants to merge 3 commits intoopensearch-project:mainfrom
darjisagar7:DocumentInput
Open

Updating the FieldMappers to insert call addField of DocumentInput#21001
darjisagar7 wants to merge 3 commits intoopensearch-project:mainfrom
darjisagar7:DocumentInput

Conversation

@darjisagar7
Copy link
Copy Markdown
Contributor

@darjisagar7 darjisagar7 commented Mar 26, 2026

Description

This PR adds the following functionalities:

  1. Introducing a new Feature Flag for the Pluggable dataformat feature. [RFC] OpenSearch Engine: Pluggable Component Bundling #20644
  2. It adds a new IndexSettings named "index.pluggable.dataformat.enabled", which can be used along with feature flag to enable the pluggable dataformat for an Index.
  3. This PR also ensures that the FieldMapper are decoupled from the Lucene's code. Based on the pluggable dataformat architecture the data will be added to DocumentInput which will add the data to respective dataformat.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 7f26163: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 26, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 78ffe1a.

PathLineSeverityDescription
server/src/test/java/org/opensearch/index/mapper/FakeStringFieldMapper.java138lowparseCreateFieldForPluggableFormat() writes to context.doc() (Lucene document) instead of context.documentInput(), which is inconsistent with the pluggable format contract established in all other mappers. When the pluggable dataformat path is active this method is called instead of parseCreateField(), so the field ends up in the Lucene document rather than the DocumentInput, silently defeating the pluggable-format routing logic. This is a test helper class so impact is limited to tests, but the pattern is anomalous compared to every other implementation in this diff.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 93426fa: SUCCESS

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 89.36170% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.31%. Comparing base (d26ad17) to head (e74c078).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
...mapper/annotatedtext/AnnotatedTextFieldMapper.java 0.00% 9 Missing ⚠️
.../opensearch/index/mapper/size/SizeFieldMapper.java 0.00% 3 Missing ⚠️
...rg/opensearch/join/mapper/ParentIdFieldMapper.java 66.66% 1 Missing and 1 partial ⚠️
...h/index/mapper/ICUCollationKeywordFieldMapper.java 88.88% 0 Missing and 2 partials ⚠️
...earch/index/mapper/murmur3/Murmur3FieldMapper.java 75.00% 1 Missing and 1 partial ⚠️
...opensearch/index/mapper/FlatObjectFieldMapper.java 92.00% 1 Missing and 1 partial ⚠️
...va/org/opensearch/index/mapper/HllFieldMapper.java 86.66% 1 Missing and 1 partial ⚠️
...g/opensearch/index/mapper/WildcardFieldMapper.java 83.33% 1 Missing and 1 partial ⚠️
...pensearch/index/mapper/ScaledFloatFieldMapper.java 94.44% 0 Missing and 1 partial ⚠️
...earch/index/mapper/SearchAsYouTypeFieldMapper.java 88.88% 0 Missing and 1 partial ⚠️
... and 4 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #21001      +/-   ##
============================================
- Coverage     73.35%   73.31%   -0.05%     
- Complexity    73209    73303      +94     
============================================
  Files          5921     5921              
  Lines        333798   333979     +181     
  Branches      48124    48155      +31     
============================================
- Hits         244862   244850      -12     
- Misses        69387    69564     +177     
- Partials      19549    19565      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

Persistent review updated to latest commit bff52e7

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

Persistent review updated to latest commit 6ce9abf

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

❌ Gradle check result for 6ce9abf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

Persistent review updated to latest commit 6ce9abf

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

✅ Gradle check result for 6ce9abf: SUCCESS

Comment on lines +678 to +683
@Override
protected void parseCreateFieldForPluggableFormat(ParseContext context) throws IOException {
final String value = context.externalValueSet() ? context.externalValue().toString() : context.parser().textOrNull();
if (value == null) {
return;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could further extract the common logic

private String extractValue(ParseContext context) throws IOException {
    String value = context.externalValueSet()
        ? context.externalValue().toString()
        : context.parser().textOrNull();
    return value;
}

Comment on lines +399 to +402
protected final boolean isPluggableDataFormatFeatureEnabled(ParseContext parseContext) {
return isPluggableDataFormatEnabled(parseContext.indexSettings().getSettings());
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can cache this instead of reading from index settings everytime?

protected MultiFields multiFields;
protected CopyTo copyTo;
protected DerivedFieldGenerator derivedFieldGenerator;
private final AtomicBoolean isPluggableDataFormatFeatureEnabled = new AtomicBoolean(false);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unused, prefer using volatile once you implement the caching logic

Comment on lines +616 to +624
assert parser.currentToken() == XContentParser.Token.START_OBJECT;
parser.nextToken(); // Skip the outer START_OBJECT. Need to return on END_OBJECT.

LinkedList<String> path = new LinkedList<>(Collections.singleton(fieldType().name()));
HashSet<String> pathParts = new HashSet<>();
while (parser.currentToken() != XContentParser.Token.END_OBJECT) {
parseToken(parser, context, path, pathParts);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is again redundant

Comment on lines +77 to +87
/**
* Gates the functionality of pluggable dataformat feature.
*/
public static final String PLUGGABLE_DATAFORMAT_EXPERIMENTAL_FLAG = FEATURE_FLAG_PREFIX + "pluggable.dataformat.enabled";

public static final Setting<Boolean> PLUGGABLE_DATAFORMAT_EXPERIMENTAL_SETTING = Setting.boolSetting(
PLUGGABLE_DATAFORMAT_EXPERIMENTAL_FLAG,
false,
Property.NodeScope
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we gating behind both feature flag and index settings?

Sagar Darji added 2 commits April 4, 2026 00:21
1. Introducing FeatureFlag and Index Setting for pluggable dataformat feature.
2. Updating the FieldMappers to insert fields in DocumentInput for Multi Format Engine

Signed-off-by: Sagar Darji <darsaga@amazon.com>

# Conflicts:
#	CHANGELOG.md
…Mapper class

Signed-off-by: Sagar Darji <darsaga@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

Persistent review updated to latest commit 2290cfc

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

❌ Gradle check result for 2290cfc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@darjisagar7 darjisagar7 closed this Apr 4, 2026
@darjisagar7 darjisagar7 reopened this Apr 4, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

Persistent review updated to latest commit 2290cfc

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

❕ Gradle check result for 2290cfc: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

Persistent review updated to latest commit e74c078

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

✅ Gradle check result for e74c078: SUCCESS


@Override
protected void parseCreateFieldForPluggableFormat(ParseContext context) throws IOException {
// TODO check how we can support addWithKey for pluggable dataformat
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should throw here if it is not supported?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I can throw an exception here

* @throws UnsupportedOperationException if the mapper does not support pluggable data formats
*/
@ExperimentalApi
protected abstract void parseCreateFieldForPluggableFormat(ParseContext context) throws IOException;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't have this abstract as this breaks everything that implements FieldMapper outside of the plugin. This should throw unsuspported?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With pluggable dataformat if lucene is selected then every functionality that is working today should work as it is so adding the abstract here ensures that this function is Overriden at all the FieldMapper places.

context.documentInput().addField(fieldType(), value);
}

private String parseKeywordValue(ParseContext context) throws IOException {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! how are we testing we are testing all such common functionalities continue to work? I see have a CapturingDocumentInput which is being run against the tests written, but how do we ensure that existing tests can also work there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current tests are working as it is we have not modified them, so that should give us the confidence that existing flow is working as expected. Or do we need some other mechanism to confirm this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants