Store Template's mappings as bytes for disk serialization by probakowski · Pull Request #78746 · elastic/elasticsearch

probakowski · 2021-10-06T10:29:43Z

This changes the way we store mappings in Template during serialization to disk: instead of storing them as a map, we use the byte array that we already have. This avoids a deserialization-serialization cycle when storing cluster state on disk.
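A minimal sketch of the idea, assuming a simplified holder that keeps a mapping only in compressed form. `CompressedMappingSketch` and its methods are hypothetical stand-ins built on `java.util.zip`; they are not the Elasticsearch `CompressedXContent` API. The disk path returns the bytes that are already held, while the API path is the only one that pays for decompression.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical stand-in for a compressed mapping holder; not the Elasticsearch class. */
final class CompressedMappingSketch {
    private final byte[] compressed;

    CompressedMappingSketch(String json) {
        this.compressed = deflate(json.getBytes(StandardCharsets.UTF_8));
    }

    /** Disk path: hand out a copy of the bytes we already hold, no parsing involved. */
    byte[] bytesForDisk() {
        return compressed.clone();
    }

    /** API path: decompress so the mapping can be rendered as readable JSON. */
    String jsonForApi() {
        return new String(inflate(compressed), StandardCharsets.UTF_8);
    }

    static byte[] deflate(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the stream flushes the remaining compressed data into 'out'.
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] inflate(byte[] input) {
        try (InflaterInputStream in = new InflaterInputStream(new ByteArrayInputStream(input))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, writing to disk never touches `inflate`, which is the cycle the description says the change avoids.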

original-brownbear

I think we should do this more like we do for mappings (maybe even reuse some code from there) to keep things simpler. No need to do any hacks around reading base64 from JSON.

original-brownbear · 2021-10-07T07:31:06Z

server/src/main/java/org/elasticsearch/cluster/metadata/Template.java

-            if (uncompressedMapping.size() > 0) {
-                builder.field(MAPPINGS.getPreferredName());
-                builder.map(reduceMapping(uncompressedMapping));
+            if (Metadata.CONTEXT_MODE_GATEWAY.equals(params.param(Metadata.CONTEXT_MODE_PARAM, Metadata.CONTEXT_MODE_API))) {


It would be nice if this followed the exact same conventions we use for mappings, respecting the binary parameter and serializing as bytes for everything but the API.

I've added the binary parameter and changed the usage to be similar to IndexMetadata.

original-brownbear · 2021-10-07T07:31:55Z

server/src/main/java/org/elasticsearch/cluster/metadata/Template.java

+            Object compressed = values.get("compressed");
+            if (compressed == null) {
+                return new CompressedXContent(Strings.toString(XContentFactory.jsonBuilder().map(values)));
+            } else if (compressed instanceof String) {


No need for this if we do things the same way we do them for mappings.

This handles the situation where you use the binary parameter with JsonXContent: it stores byte arrays as Base64-encoded strings. Without it, ToAndFromJsonMetadataTests#testSimpleJsonFromAndTo fails.
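To illustrate why a string shows up here: JSON has no binary type, so a text-based writer has to carry a byte array as a string, commonly Base64. The sketch below uses only `java.util.Base64`; the class name, the method `toJsonField`, and the hand-built JSON are assumptions for illustration and do not reproduce how JsonXContent is implemented.

```java
import java.util.Base64;

/** Hypothetical illustration of bytes travelling through a JSON document. */
final class Base64InJsonSketch {
    /** Encodes the bytes as Base64 and embeds them as an ordinary JSON string value. */
    static String toJsonField(String name, byte[] value) {
        String encoded = Base64.getEncoder().encodeToString(value);
        return "{\"" + name + "\":\"" + encoded + "\"}";
    }
}
```

For example, `Base64InJsonSketch.toJsonField("compressed", new byte[] {1, 2, 3})` yields `{"compressed":"AQID"}`, so a reader of that document sees a plain string and has to decode it to get the bytes back.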

original-brownbear · 2021-10-07T07:37:16Z

test/framework/src/main/java/org/elasticsearch/test/XContentTestUtils.java

            } else {
                return path + ": first element is null, the second element is not null";
            }
+        } else if (first instanceof byte[]) {


Probably also not necessary to extend things this way if we just do what we do for mappings and have a special binary param path here I think.

This is required for ESIntegTestCase#ensureClusterStateCanBeReadByNodeTool to pass: this test uses binary serialization, and this is the first instance where we have to compare byte[] values in it.
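As background on why a dedicated `byte[]` branch matters in a comparison helper: Java arrays inherit `Object.equals`, which compares identity, so two arrays with identical contents are reported as different unless they are compared with `Arrays.equals`. The helper below is a hypothetical sketch of such a branch, not the code in XContentTestUtils.

```java
import java.util.Arrays;
import java.util.Objects;

/** Hypothetical comparison helper that treats byte[] by content. */
final class ByteArrayCompareSketch {
    static boolean sameValue(Object first, Object second) {
        if (first instanceof byte[] && second instanceof byte[]) {
            // Content comparison; first.equals(second) would only compare references.
            return Arrays.equals((byte[]) first, (byte[]) second);
        }
        return Objects.equals(first, second);
    }
}
```

With `byte[] a = {1, 2}` and `byte[] b = {1, 2}`, `a.equals(b)` is false while `ByteArrayCompareSketch.sameValue(a, b)` is true.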

Why don't we have that for mappings already, they should be binary serialized as well and appear to run through the exact same code?

There were two different parsing paths, IndexMetadata->bytes->IndexMetadata and ComposableIndexMetadata->bytes->UnknownMetadataCustom, when using ElasticsearchNodeCommand#namedXContentRegistry, so the results differed under binary serialization. I changed ElasticsearchNodeCommand to parse ComposableIndexMetadata, so this code is no longer needed.

probakowski · 2021-10-12T10:13:15Z

@elasticmachine update branch

probakowski · 2021-10-12T10:54:04Z

@elasticmachine run elasticsearch-ci/part-2

probakowski · 2021-10-12T22:10:02Z

@original-brownbear I simplified the code and made it similar to IndexMetadata. Would you mind taking another look?

original-brownbear

Still not sure why we have to do anything different than we do with mappings here in the tests and in part of production code. Can you explain what's different here?

original-brownbear · 2021-10-13T10:51:09Z

server/src/main/java/org/elasticsearch/cluster/metadata/Template.java

-            new CompressedXContent(Strings.toString(XContentFactory.jsonBuilder().map(p.mapOrdered()))), MAPPINGS);
+        PARSER.declareField(ConstructingObjectParser.optionalConstructorArg(), (p, c) -> {
+            XContentParser.Token token = p.currentToken();
+            if (token == XContentParser.Token.VALUE_STRING) {


Where do we encode the template like this? This seems broken?
We don't seem to run into a base64 JSON string anywhere for mappings so we shouldn't for templates?

This shows up with the combination of JsonXContent and the binary parameter. I think we have that only in ToAndFromJsonMetadataTests#testSimpleJsonFromAndTo. The thing is, we don't serialize IndexMetadata there, because Metadata.toXContent filters them out if the context is not XContentContext.API, so we don't have to serialize mappings for them either. A test like the one below would fail precisely because we don't handle the VALUE_STRING case in IndexMetadata; we just don't test it. It would work with SmileXContent.

    IndexMetadata.Builder indexMetadataBuilder = IndexMetadata.builder("test12")
        .settings(settings(Version.CURRENT))
        .numberOfShards(1)
        .numberOfReplicas(0)
        .putMapping("{\"mapping1\":{\"text1\":{\"type\":\"string\"}}}");

    Map<String, String> params = new HashMap<>(2);
    params.put("binary", "true");
    params.put(Metadata.CONTEXT_MODE_PARAM, Metadata.CONTEXT_MODE_GATEWAY);

    XContentBuilder builder = JsonXContent.contentBuilder();
    builder.startObject();
    indexMetadataBuilder.build().toXContent(builder, new ToXContent.MapParams(params));
    builder.endObject();

    IndexMetadata indexMetadata = IndexMetadata.fromXContent(createParser(builder));
    assertNotNull(indexMetadata.mapping());
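The read-side distinction described above can be sketched as a dispatch on the token the parser reports. The `Token` enum and `readBinary` below are hypothetical and only loosely modelled on the two cases in this thread (a Base64 string from a JSON source, raw bytes from a binary source such as Smile); they are not the XContentParser API.

```java
import java.util.Base64;

/** Hypothetical reader for a field that may arrive as text or as raw bytes. */
final class BinaryFieldReaderSketch {
    /** Assumed token kinds for this sketch only. */
    enum Token { VALUE_STRING, VALUE_EMBEDDED_OBJECT, VALUE_NULL }

    static byte[] readBinary(Token token, Object value) {
        switch (token) {
            case VALUE_STRING:
                // Text formats carry the bytes as a Base64 string.
                return Base64.getDecoder().decode((String) value);
            case VALUE_EMBEDDED_OBJECT:
                // Binary formats can hand over the bytes directly.
                return (byte[]) value;
            default:
                throw new IllegalArgumentException("unexpected token: " + token);
        }
    }
}
```

A reader that handles only the embedded-object case would reject the string form in this sketch, which mirrors the failure described for the JSON plus binary combination.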

original-brownbear · 2021-10-13T10:51:34Z

test/framework/src/main/java/org/elasticsearch/test/XContentTestUtils.java

            } else {
                return path + ": first element is null, the second element is not null";
            }
+        } else if (first instanceof byte[]) {


Why don't we have that for mappings already, they should be binary serialized as well and appear to run through the exact same code?

probakowski · 2021-10-14T12:23:20Z

@elasticmachine update branch

probakowski · 2021-10-14T17:55:30Z

@elasticmachine update branch

probakowski · 2021-10-19T07:47:33Z

@elasticmachine update branch

probakowski · 2021-10-19T20:41:58Z

@elasticmachine update branch

probakowski · 2021-10-19T21:03:54Z

@elasticmachine update branch

elasticsearchmachine · 2021-10-19T22:06:50Z

💔 Backport failed

Status	Branch	Result
❌	7.x	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 78746

…79522) This change the way we store mappings in `Template` during serialization to disk - instead storing it as map we use byte array that we already have. This avoids deserialization-serialization cycle during storing cluster state on disk. # Conflicts: # server/src/main/java/org/elasticsearch/cluster/coordination/ElasticsearchNodeCommand.java

elasticmachine · 2021-12-03T17:16:35Z

Pinging @elastic/es-data-management (Team:Data Management)

Store mappings as bytes for disk serialization

a17e437

probakowski added >enhancement v8.0.0 v7.16.0 labels Oct 6, 2021

probakowski requested review from martijnvg and original-brownbear October 6, 2021 10:29

probakowski added 5 commits October 6, 2021 13:13

Test fix

1bbccb6

rollback changes

d490007

fix tests

0880079

Merge remote-tracking branch 'origin/master' into template-fix

25bb42d

fix serialization with json

256618f

original-brownbear reviewed Oct 7, 2021

probakowski added 4 commits October 7, 2021 09:59

change serialization

f36868c

Merge branch 'master' into template-fix

1473a23

rework

f0457af

Merge remote-tracking branch 'origin/master' into template-fix

a257b20

probakowski changed the title ~~Store mappings as bytes for disk serialization~~ Store Template's mappings as bytes for disk serialization Oct 11, 2021

probakowski added 2 commits October 11, 2021 20:17

fix compilation

0d29911

Merge remote-tracking branch 'origin/master' into template-fix

579e139

Merge branch 'master' into template-fix

fda4aa9

probakowski requested a review from original-brownbear October 12, 2021 22:09

original-brownbear reviewed Oct 13, 2021

probakowski self-assigned this Oct 14, 2021

parse ComposableTemplateMetadata with node tool

1758f8d

elasticmachine and others added 2 commits October 14, 2021 23:23

Merge branch 'master' into template-fix

e1bd326

parse ComponentTemplate in node tool

58a691a

probakowski added 3 commits October 14, 2021 16:43

rollback unneeded part

5b0408f

rollback unneeded part

f45b195

rollback unneeded part

77f744d

Merge branch 'master' into template-fix

81265a8

probakowski requested a review from original-brownbear October 14, 2021 20:48

original-brownbear approved these changes Oct 19, 2021

Merge branch 'master' into template-fix

d8f4f86

Merge branch 'master' into template-fix

e4ccdaa

Merge branch 'master' into template-fix

58016b3

probakowski added auto-backport-and-merge auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Oct 19, 2021

elasticsearchmachine merged commit 675c1f4 into elastic:master Oct 19, 2021

probakowski mentioned this pull request Oct 19, 2021

[7.x] Store Template's mappings as bytes for disk serialization (#78746) #79522

Merged

probakowski deleted the template-fix branch October 19, 2021 22:28

jakelandis added v8.0.0-beta1 and removed v8.0.0 labels Oct 27, 2021

danhermann added the :Data Management/Indices APIs DO NOT USE. Use ":Distributed/Indices APIs" or ":StorageEngine/Templates" instead. label Dec 3, 2021

elasticmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Dec 3, 2021

arteam mentioned this pull request May 6, 2024

[CI] RestoreTemplateWithMatchOnlyTextMapperIT test failing #107515

Closed
