consolidate query _name and boost support in query DSL #11744

@javanna

As part of the query refactoring we are trying to make support for boost and _name in our queries more generic and consistent (#10776). That support is currently copy-pasted into every query parser and builder, is sometimes forgotten, and is error prone. While working on that, I realized that _name and boost are supported slightly differently in the json depending on the query type. The main difference, besides bugs, is that some queries support both within the top level query object:

{
  "multi_match" : {
    "query" : "test",
    "fields" : ["field1", "field2"],
    "_name" : "query_name",
    "boost" : 10
  }
}

while others support them within an inner object named after the field that gets queried:

{
  "term" : {
    "field_name" : {
      "value" : "test",
      "_name" : "query_name",
      "boost" : 10
    }
  }
}

or even the following:

{
  "term" : {
    "field_name" : "test",
    "_name" : "query_name",
    "boost" : 10
  }
}
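
To make the difference concrete, here is a minimal Python sketch of what a client has to do today to read _name and boost regardless of which of the three layouts above a query uses. The helper `extract_name_and_boost` is hypothetical and written only for illustration; it is not part of Elasticsearch and does not mirror the actual parsers.

```python
from typing import Any, Dict, Optional, Tuple


def extract_name_and_boost(
    query: Dict[str, Any],
) -> Tuple[Optional[str], Optional[float]]:
    """Return (_name, boost) from a single query object, or None where absent.

    Hypothetical helper: it only understands the three layouts shown above.
    """
    # A query object has a single key, the query type (e.g. "term").
    ((_query_type, body),) = query.items()

    # Layouts 1 and 3: _name and boost sit directly in the top level object.
    name = body.get("_name")
    boost = body.get("boost")

    # Layout 2: they sit in the inner object keyed by the field name.
    for key, value in body.items():
        if key in ("_name", "boost") or not isinstance(value, dict):
            continue
        if name is None:
            name = value.get("_name")
        if boost is None:
            boost = value.get("boost")

    return name, (None if boost is None else float(boost))
```

Having to probe two levels like this is exactly the kind of per-query special casing that a single, consistent placement would remove.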

The following is a summary of all of the queries and their behaviour:

Query _name top level _name inner object boost top level boost inner object
and X
bool X X
boosting X
common_terms X X
constant_score X
dis_max X X
exists X
field_masking_span X X
filtered X X
function_score X
fuzzy X X
geo_bbox ** X
geo_distance ** X
geo_distance_range ** X
geohash_cell
geo_polygon ** X
geo_shape ** X X
has_child X X
has_parent X X
ids X X
indices X
limit
match_all X
match X X
missing X X
more_like_this X X
multi_match X X
nested X X
not X
or X
prefix ** X X X
query_string X X
range ** X X
regexp ** X X X
script X
simple_query_string X
span_containing X X
span_first X X
span_multi
span_near X X
span_not X X
span_or X X
span_term X X
span_within X X
term ** X X X X
terms X X
type
wildcard X X

** : queries that support the inner object named as the field that gets queried, but support _name on the top level object instead.

All of the queries that support _name and boost within the inner object do so in their long version. The corresponding short version usually doesn't support _name and boost, but only a single field in the form field: value.
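
As a rough illustration of the short versus long version, the following Python sketch expands a `field: value` short form into the inner-object long form so that _name and boost have somewhere to go. The function `expand_short_form` and its `value_key` parameter are assumptions made for this example; the key that holds the value in the long form differs per query (for instance `value` for term and `query` for match), so the caller has to supply it.

```python
from typing import Any, Dict, Optional


def expand_short_form(
    query: Dict[str, Any],
    value_key: str = "value",
    name: Optional[str] = None,
    boost: Optional[float] = None,
) -> Dict[str, Any]:
    """Rewrite {type: {field: value}} as {type: {field: {value_key: value}}}.

    Hypothetical helper: assumes a query with exactly one field entry.
    """
    ((query_type, body),) = query.items()
    ((field, value),) = body.items()

    # Already in long form: copy the inner object instead of wrapping it again.
    inner = dict(value) if isinstance(value, dict) else {value_key: value}

    # Only the long form has room for _name and boost.
    if name is not None:
        inner["_name"] = name
    if boost is not None:
        inner["boost"] = boost

    return {query_type: {field: inner}}
```

For example, `expand_short_form({"term": {"field_name": "test"}}, name="query_name", boost=10)` produces the long term query shown earlier.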

  1. The main question is where _name should be supported. Should it always be in the top level object, as I was expecting without looking at the code, or does it make sense to wrap everything in the inner object if present?

  2. Queries that don't support _name at all: boosting, constant_score, function_score, geohash_cell, limit, match_all, span_multi, type. Does support for _name need to be added to these?

  3. Queries that don't support boost: indices, simple_query_string (PR opened), span_multi. Should support for boost be added to these?

  4. Filters that don't support boost: and, bool, exists, geo_bbox, geo_distance, geo_distance_range, geohash_cell, geo_polygon, limit, missing, not, or, script, type. Now that filters and queries are merged, can these be used as queries? If so, would boost make sense for them too?

  5. The following are inconsistencies that definitely need to be fixed, depending also on the outcome of this discussion:
