Skip to content

Render Mappings more Compact in GET /_cluster/state#83846

Closed
original-brownbear wants to merge 2 commits intoelastic:mainfrom
original-brownbear:more-compact-cs-rest-clean
Closed

Render Mappings more Compact in GET /_cluster/state#83846
original-brownbear wants to merge 2 commits intoelastic:mainfrom
original-brownbear:more-compact-cs-rest-clean

Conversation

@original-brownbear
Copy link
Copy Markdown
Contributor

@original-brownbear original-brownbear commented Feb 11, 2022

This deduplicates index mappings in the /_cluster/state response by grouping them by the sha256 hash of the mapping.
This significantly reduces the response size for large cluster states,
making this debug API easier to read for humans (because understanding
whether or not mappings are duplicated across indices might be valuable)
and more importantly less ressource hungry as far as serialization goes.

Adds an undocumented parameter to get back to the old format if we actually have to in an unforeseen instance.

New rendering of an index:

      "my-index-000001" : {
        "version" : 5,
        "mapping_version" : 2,
        "settings_version" : 1,
        "aliases_version" : 1,
        "routing_num_shards" : 1024,
        "state" : "open",
        "settings" : {
          "index" : {
            "routing" : {
              "allocation" : {
                "include" : {
                  "_tier_preference" : "data_content"
                }
              }
            },
            "number_of_shards" : "1",
            "provided_name" : "my-index-000001",
            "creation_date" : "1644601309989",
            "number_of_replicas" : "1",
            "uuid" : "9m19TP4DRsevoKkSG3DC0g",
            "version" : {
              "created" : "8020099"
            }
          }
        },
        "mappings" : "XinOY78Sl3l55aaFJrhIKDlniKVodszNHRmKhONrR/o=",

and then mappings at the same level as "indices":

  "mappings" : {
      "5jySc/VOXLKC1itGKeilXwF6kCFL0i5Xe0DqJ0eDgwk=" : {
        "dynamic" : "strict",
        "_meta" : {
          "version" : "8.2.0"
        },
        "properties" : {
          "chunk" : {
            "type" : "integer"
          },
          "data" : {
            "type" : "binary"
          },
          "name" : {
            "type" : "keyword"
          }
        }
      },
      "XinOY78Sl3l55aaFJrhIKDlniKVodszNHRmKhONrR/o=" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date"
          },
          "message" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "user" : {
            "properties" : {
              "id" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
    }

relates #77466

This deduplicates index mappings in the `/_cluster_state` response.
This significantly reduces the response size for large cluster states,
making this debug API easier to read for humans (because understanding
whether or not mappings are duplicated across indices might be valuable)
and more importantly less ressource hungry as far as serialization goes.
@original-brownbear original-brownbear added >enhancement :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.2.0 labels Feb 11, 2022
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team. label Feb 11, 2022
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @original-brownbear, I've created a changelog YAML for you.

Copy link
Copy Markdown
Member

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally +1 from me but we still need to consider whether to treat this as a breaking change or not. We could alternatively not change the default but start sending 429s to clients that request an unreasonably large response. Marking this as Request changes until these questions are resolved.

Copy link
Copy Markdown
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preparing for our discussion today, I read through this and provided a couple of comments.

if (mmd == null) {
builder.nullField(KEY_MAPPINGS);
} else {
builder.field(KEY_MAPPINGS, indexMetadata.mapping().getSha256());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to pick a new name for the SHA256. mappings_sha256 maybe or perhaps shared_mappings_id. Then we can also allow dumping both, which could provide a transitional path for clients.


if (context == XContentContext.API) {
if (params.paramAsBoolean(MAPPINGS_BY_HASH_PARAM, true)) {
builder.startObject("mappings");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can call this shared_mappings and then have a flag of the same name to the API as well as a cluster setting to disable dumping non-shared mappings?

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:09
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
Copy link
Copy Markdown
Member

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Henning and I discussed this again today. We're not confident enough that nobody would be affected by this change in the default behaviour, and we think with chunked-encoding we can protect Elasticsearch against harm at least, so we would prefer to make the desired behaviour opt-in today, deprecate the old default, and remove it in the next major version.

I'll raise a proposal to do this with the breaking changes committee.

@csoulios csoulios added v8.6.0 and removed v8.5.0 labels Sep 21, 2022
@kingherc kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022
@rjernst rjernst added v8.8.0 and removed v8.7.0 labels Feb 8, 2023
@gmarouli gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023
@quux00 quux00 added v8.11.0 and removed v8.10.0 labels Aug 16, 2023
@mattc58 mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023
@elasticsearchmachine elasticsearchmachine added v9.1.0 Team:Distributed Coordination (obsolete) Meta label for Distributed Coordination team. Obsolete. Please do not use. and removed v9.0.0 labels Jan 30, 2025
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-distributed-obsolete (Team:Distributed (Obsolete))

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@original-brownbear original-brownbear closed this by deleting the head repository Nov 28, 2025
@pickypg
Copy link
Copy Markdown
Member

pickypg commented Jan 8, 2026

I see that this was closed because it lacked consensus on the breaking change. Why not take this in steps:

  1. Add this as a feature behind a flag that can be opted into
  2. In the next major (or at some arbitrary point where we agree the breaking change is worthwhile), make it the default

What do you think?

@DaveCTurner DaveCTurner removed the Team:Distributed Coordination (obsolete) Meta label for Distributed Coordination team. Obsolete. Please do not use. label Feb 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement Team:Distributed Meta label for distributed team. v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.