
Explore disabling the _all field by default #19784

@dakrone

We should explore the idea of disabling the _all field by default. For
background, the _all field contains the contents of each of a document's
fields. During indexing, these contents are copied to the _all field, analyzed
with a specific analyzer, and then indexed. At query time, some queries such as
the query_string and simple_query_string queries search the _all field if
no fields are specified.
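
As a rough illustration of the background above, the following Python sketch mimics the idea of collecting every field value of a document into one space-delimited string. It is a conceptual model only, not the actual indexing code: the helper name `build_all` and the sample document are made up for this example, and the real _all field is also analyzed before being indexed.

```python
def build_all(doc):
    """Flatten every leaf value of `doc` into one space-delimited string.

    Conceptual stand-in for what ends up in _all; `build_all` is a
    hypothetical helper, not part of Elasticsearch.
    """
    parts = []

    def walk(value):
        if isinstance(value, dict):
            for child in value.values():
                walk(child)
        elif isinstance(value, (list, tuple)):
            for child in value:
                walk(child)
        elif value is not None:
            parts.append(str(value))

    walk(doc)
    return " ".join(parts)


# Hypothetical sample document.
doc = {"title": "Quick fox", "tags": ["a", "b"], "author": {"name": "kim"}}
all_text = build_all(doc)
```

The sketch makes the duplication visible: every value in the document appears a second time in the combined string.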

There are a number of issues with the _all field:

  • It uses a fair amount of disk space, due to duplicating the values from all
    other fields to an additional field
  • _all has its own analyzer, which is confusing when expecting to query
    text analyzed a certain way (or with synonyms, for example) and discovering
    that it does not match due to an analysis difference
  • Additional indexing overhead caused by data duplication
  • Since the _all field is not retrievable or part of _source, its contents
    cannot easily be inspected for debugging purposes. Some users do not know
    it even exists, which causes confusion at query time
  • Better alternatives exist, such as the copy_to parameter in mappings,
    which can be used to create custom _all fields
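
To make the copy_to alternative from the list above concrete, here is a minimal Python sketch that builds a mapping body with _all disabled and every listed field copied into a custom catch-all field. The helper `catch_all_mapping`, the field names, and the target name `catch_all` are hypothetical. The `text` type is an assumption (older versions use `string`), and depending on the version the body may need to be nested under a mapping type name.

```python
import json


def catch_all_mapping(fields, target="catch_all"):
    """Build a mapping body that disables _all and copies each field to `target`.

    `catch_all_mapping` and `target` are illustrative names, not an
    Elasticsearch API. The "text" type is assumed; adjust for your version.
    """
    properties = {name: {"type": "text", "copy_to": target} for name in fields}
    # The target field needs its own mapping, so it can use any analyzer we pick.
    properties[target] = {"type": "text"}
    return {"_all": {"enabled": False}, "properties": properties}


mapping = catch_all_mapping(["title", "body"])
mapping_json = json.dumps(mapping, indent=2)
```

Unlike _all, the custom field is an ordinary mapped field, so its analyzer is chosen explicitly and only the fields that opt in are duplicated.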

If we were to change the _all field to be disabled by default, some queries
would have to be handled differently. The query_string and
simple_query_string queries would have to know a default field or fields to
query. Perhaps we would be able to change these queries to send a
"fields": ["*"] parameter if no fields are specified in the JSON query, so
that all field values can still be queried.
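
The fallback proposed above could look roughly like the following Python sketch, which rewrites a query body before it is sent. This is only one possible reading of the proposal, done on the client side for illustration. The helper name `with_default_fields` is hypothetical, and treating an explicit `default_field` as "fields were specified" is an assumption.

```python
import copy

# Query types that fall back to _all when no fields are given.
QUERY_TYPES = ("query_string", "simple_query_string")


def with_default_fields(query):
    """Return a copy of `query` with "fields": ["*"] added where none are set.

    `with_default_fields` is a hypothetical helper, not an Elasticsearch API.
    """
    result = copy.deepcopy(query)
    for name in QUERY_TYPES:
        body = result.get(name)
        if not isinstance(body, dict):
            continue
        # Assumption: an explicit default_field also counts as a field choice.
        if "fields" not in body and "default_field" not in body:
            body["fields"] = ["*"]
    return result


rewritten = with_default_fields({"query_string": {"query": "quick fox"}})
untouched = with_default_fields(
    {"query_string": {"query": "quick fox", "fields": ["title"]}}
)
```

Queries that already name their fields pass through unchanged, so only the queries that previously relied on _all would change behavior.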
