Some features, like `random_score`, use fielddata on `_id`, and our documentation sometimes recommends sorting on `_id` in order to have a stable sort across pages (e.g. https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-search-after.html). However, fielddata on such unique fields is very costly when you have a significant amount of data, so we should move away from it.
We probably have two options here: either add doc values to `_id` (#11887), which proved controversial due to the overhead it adds to the index, or switch to `_doc`, which has the drawback of not being comparable across shards or after merges.
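
For context, the pagination pattern in question looks roughly like the sketch below. It only builds the request bodies and does not talk to a cluster. `build_page_request` is a hypothetical helper, and the `date` field and the sample sort values are assumptions made for illustration.

```python
import json


def build_page_request(size, search_after=None):
    """Build a search body that sorts on a field, with _id as the tiebreaker."""
    body = {
        "size": size,
        "query": {"match_all": {}},
        "sort": [
            {"date": "asc"},  # assumed example field
            {"_id": "asc"},   # tiebreaker: this is the sort that relies on fielddata on _id
        ],
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page, in sort order.
        body["search_after"] = list(search_after)
    return body


first_page = build_page_request(10)
# Hypothetical sort values taken from the last hit of the first page.
next_page = build_page_request(10, search_after=[1463538857, "654323"])
print(json.dumps(next_page, indent=2))
```

Replacing the `_id` tiebreaker with `_doc` would avoid the fielddata cost, but as noted above the resulting order would not be comparable across shards or after merges.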