Original issue: #49028
Feature branch: field-retrieval
Docs: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-fields.html
Motivation
Often a user wants to retrieve a particular set of fields during a search. Currently, we don't support this usage pattern well. In short, given a list of fields, there is no easy way to load all of their values:
- We can’t load all of them from doc values. Some fields like text fields may not have doc values at all, or we may exceed the limit for a reasonable number of doc value fields to load.
- It’s not easy to load all of them through source. For example, if the field is a field alias, it’s difficult to determine where to find its value in the source.
Better field retrieval support is becoming even more important now that we're introducing more field types that don’t fit the typical pattern, like constant_keyword and the proposed runtime fields (#48063).
Feature Summary
We plan to add a new fields section to the search request, which users would specify instead of using source filtering to load fields from source:
POST logs-*/_search
{
  "query": { "match_all": {} },
  "fields": [
    "file.*",
    {
      "field": "event.timestamp",
      "format": "epoch_millis"
    },
    ...
  ]
}
Both full field names and wildcard patterns are accepted. Only leaf fields are returned; the API will not allow fetching object values. The fields are returned as a flat list in the fields section of each hit, the same as we do for docvalue_fields and script_fields.
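As a rough illustration of the response shape described above, the following Python sketch flattens a nested _source into leaf paths and keeps only those matching the requested names or wildcard patterns. The helper names (flatten_leaves, select_fields) are hypothetical, and fnmatch-style matching is an assumption standing in for whatever pattern matching the API ends up using. This is not the Elasticsearch implementation.

```python
from fnmatch import fnmatchcase


def flatten_leaves(source, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in a nested dict.

    Simplification: lists are treated as leaf values, so arrays of
    objects are not descended into.
    """
    for key, value in source.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten_leaves(value, prefix=f"{path}.")
        else:
            yield path, value


def select_fields(source, patterns):
    """Return a flat {field_name: [values]} dict for matching leaf fields."""
    fields = {}
    for path, value in flatten_leaves(source):
        # Assumption: fnmatch-style wildcards approximate the API's patterns.
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            values = value if isinstance(value, list) else [value]
            fields.setdefault(path, []).extend(values)
    return fields


source = {
    "file": {"name": "a.log", "size": 120},
    "event": {"timestamp": 1587000000000},
    "message": "started",
}
print(select_fields(source, ["file.*", "event.timestamp"]))
# A bare object name matches no leaf path, so nothing is returned for it.
print(select_fields(source, ["file"]))
```

Because only leaf paths are ever produced, requesting an object such as file on its own yields an empty result in this sketch, which mirrors the "only leaf fields" rule.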
Overall, the API gives a friendly way to load fields from source:
- If a non-standard field like a field alias, multi-field, or constant_keyword is specified in fields, then we’ll consult the mappings to find and return the right value.
- The fields are returned in a flat list, as opposed to structured JSON.
- For date and numeric field types, we would support the same format parameter as we do for docvalue_fields to allow for adjusting the format of the results.
- Each value would be returned in a 'canonical' format. For example, if a field is mapped as an integer, it will be returned as an integer even if it was specified as a string in the _source.
Some clarifications:
- In this first pass, the API will not attempt to load from stored fields or doc values.
- For simplicity of parsing, values will always be returned in an array, even if there is only one value present.
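The points above (consulting the mappings, canonical value types, always returning an array) can be sketched as follows. The MAPPINGS dict is a hypothetical, heavily simplified stand-in for real index mappings, and fetch_field and get_path are illustrative names rather than anything in the codebase; only the alias path and constant_keyword value parameters are taken from the actual mapping syntax.

```python
def get_path(source, dotted_path):
    """Walk a dotted path through nested dicts; return None if absent."""
    node = source
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


# Hypothetical, simplified mappings: field name -> mapping definition.
MAPPINGS = {
    "http.status": {"type": "integer"},
    "status": {"type": "alias", "path": "http.status"},
    "dataset": {"type": "constant_keyword", "value": "logs"},
}


def fetch_field(source, mappings, name):
    """Return the values of one field as a list, using the mappings."""
    mapping = mappings[name]
    if mapping["type"] == "constant_keyword":
        # The value lives in the mapping, not in _source.
        return [mapping["value"]]
    if mapping["type"] == "alias":
        # Follow the alias to its target field.
        name = mapping["path"]
        mapping = mappings[name]
    raw = get_path(source, name)
    if raw is None:
        return []  # in this sketch, the caller would omit the field
    values = raw if isinstance(raw, list) else [raw]
    if mapping["type"] == "integer":
        # Canonical format: a string "200" in _source comes back as 200.
        values = [int(v) for v in values]
    return values


source = {"http": {"status": "200"}}
print(fetch_field(source, MAPPINGS, "status"))
print(fetch_field(source, MAPPINGS, "dataset"))
```

Even a single value is wrapped in a list, so a consumer never has to branch on whether a field is single-valued or multi-valued.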
Open Questions
- If a wildcard pattern matches both a parent field and one of its multi-fields, should we just return the parent to avoid returning the same value twice? A similar question holds for field aliases and their target fields.
- Should the API return fields in _source that have been disabled in the mappings (enabled: false)?
- For keyword fields, should we apply the normalizer or return the original value?
Implementation Plan
- fields section in the search request that fetches values from source. (Add a simple 'fetch fields' phase. #55639)
- ignore_malformed. (Allow field mappers to retrieve fields from source. #56928)
- ignore_above. (Fix casting of scaled_float in sorts (backport of #57207) #57385)
- format parameter. (Deprecte Rounding#round (backport #57845) #57893)
- null_value. (Respect null_value parameter in the fields retrieval API. #58623)
Future improvements:
- FieldMapper#lookupValues to MappedFieldType. (?)
- _size.
- source documents to speed up source access #52591.
- inner_hits.
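The implementation plan names three mapping parameters: ignore_malformed, ignore_above, and null_value. A minimal sketch of how values read from source might be filtered by them is shown below. The function name apply_mapping_params and the flat mapping dict are hypothetical, and the choice to re-raise on malformed input when ignore_malformed is unset is an assumption of this sketch, not something the issue states.

```python
def apply_mapping_params(raw_values, mapping):
    """Filter and convert source values according to mapping parameters."""
    result = []
    for value in raw_values:
        if value is None:
            # null_value: replace an explicit null when configured,
            # otherwise drop it.
            if "null_value" not in mapping:
                continue
            value = mapping["null_value"]
        if mapping["type"] == "keyword":
            value = str(value)
            limit = mapping.get("ignore_above")
            # ignore_above: skip strings longer than the limit.
            if limit is not None and len(value) > limit:
                continue
        elif mapping["type"] == "integer":
            try:
                value = int(value)
            except (TypeError, ValueError):
                # ignore_malformed: silently skip values that cannot be parsed.
                if mapping.get("ignore_malformed", False):
                    continue
                raise  # assumption: surface the error otherwise
        result.append(value)
    return result


keyword_mapping = {"type": "keyword", "ignore_above": 5, "null_value": "N/A"}
print(apply_mapping_params(["ok", "too long value", None], keyword_mapping))

integer_mapping = {"type": "integer", "ignore_malformed": True}
print(apply_mapping_params(["7", "x", None], integer_mapping))
```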