Add finer control over _source retrieval, in get, mget, get_source, explain & search API #3301

@bleskes

At the moment, all of the above APIs offer the fields parameter to retrieve parts of stored documents. However, the fields option was built to expose Lucene's stored fields and thus has some limitations when used to extract data from _source. The most important one is that it can flatten the document structure.

This feature adds a new parameter that allows retrieving parts of the _source directly, without conforming to the stored fields structure.
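
To make "flattening" concrete, here is a small illustrative sketch in Python. It is not Elasticsearch code: the flatten helper and the sample document are hypothetical, and the exact shape of a real fields response may differ. It only shows how nested objects collapse into dotted top-level keys, which is the structure loss that _source-based retrieval avoids.

```python
def flatten(doc, prefix=""):
    """Collapse a nested dict into a single-level dict keyed by dotted paths."""
    flat = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            # Descend into sub-objects, carrying the path so far as a prefix.
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


# Hypothetical document, used for illustration only.
doc = {"title": "Intro", "content": {"summary": "short", "full_text": "long"}}
flat = flatten(doc)
```

Here flat no longer contains a "content" object, only the keys "content.summary" and "content.full_text" next to "title".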

To maintain backward compatibility, you can still retrieve the _source by specifying fields=["_source"], but this special treatment will be removed in the future.

Get API

The Get API takes its parameters on the query string. Three new parameters are added: _source, _source_include and _source_exclude. They work as follows:

A flag to control _source retrieval

curl -XGET 'http://localhost:9200/index/type/1?_source=false'

or (default)

curl -XGET 'http://localhost:9200/index/type/1?_source=true'

Only retrieve part of the source

curl -XGET 'http://localhost:9200/index/type/1?_source=title,author'

or

curl -XGET 'http://localhost:9200/index/type/1?_source_include=title,content&_source_exclude=content.full_text'
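
The include/exclude combination above can be modelled with a short Python sketch. This is an assumption-laden illustration, not the Elasticsearch implementation: filter_source, _copy_path and _drop_path are hypothetical names, only exact dotted paths are handled, and it assumes that an include path naming an object keeps the whole object and that excludes are applied after includes.

```python
import copy


def _copy_path(src, dst, keys):
    """Copy the value at the dotted path `keys` from src into dst, keeping nesting."""
    key, rest = keys[0], keys[1:]
    if not isinstance(src, dict) or key not in src:
        return
    if not rest:
        dst[key] = copy.deepcopy(src[key])
    elif isinstance(src[key], dict):
        child = dst.get(key, {})
        _copy_path(src[key], child, rest)
        if child:  # do not leave empty objects behind for missing paths
            dst[key] = child


def _drop_path(node, keys):
    """Remove the value at the dotted path `keys` from node, if present."""
    for key in keys[:-1]:
        if not isinstance(node, dict) or key not in node:
            return
        node = node[key]
    if isinstance(node, dict):
        node.pop(keys[-1], None)


def filter_source(source, include=(), exclude=()):
    """Return a filtered copy of source; an empty include list means 'everything'."""
    if include:
        result = {}
        for path in include:
            _copy_path(source, result, path.split("."))
    else:
        result = copy.deepcopy(source)
    for path in exclude:
        _drop_path(result, path.split("."))
    return result


# Hypothetical document, used for illustration only.
doc = {"title": "T", "author": "A", "content": {"summary": "s", "full_text": "f"}}
filtered = filter_source(doc, include=["title", "content"], exclude=["content.full_text"])
```

Under these assumptions, filtered keeps title and the content object minus full_text, and the nested structure of the document is preserved.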

Multi Get API

The Multi Get API lets you control _source either on the query string (same syntax as the get API) or on a per-document basis.

Query String defaults

curl -XGET 'http://localhost:9200/index/type/_mget?_source=false' -d'{
  "ids": [1, 2, 3]
}'

or

curl -XGET 'http://localhost:9200/index/type/_mget?_source_include=title,content&_source_exclude=content.full_text' -d'{
  "ids": [1, 2, 3]
}'

etc.

Per document settings

curl -XGET 'http://localhost:9200/_mget' -d '{
    "docs": [
        { "_index": "test", "_type": "type1", "_id": "1", "_source": false },
        { "_index": "test", "_type": "type1", "_id": "2", "_source": "title" },
        { "_index": "test", "_type": "type1", "_id": "3", "_source": [ "title", "author" ] },
        { "_index": "test", "_type": "type1", "_id": "4",
          "_source": { "include": "content", "exclude": "content.full_text" }
        },
        { "_index": "test", "_type": "type1", "_id": "5",
          "_source": { "include": [ "title", "content" ], "exclude": [ "content.full_text" ] }
        }
    ]
}'
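
The per-document _source value above takes several shapes: a boolean, a single field name, a list of field names, or an object with include and exclude keys that are each a string or a list. A minimal Python sketch of how these could be normalised into one form follows. The names normalize_source_spec and _as_list are hypothetical, and the (fetch, includes, excludes) triple is an assumption made for illustration, not the server's internal representation.

```python
def _as_list(value):
    """Accept a missing value, a single string, or a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def normalize_source_spec(spec=True):
    """Turn any accepted _source value into (fetch, includes, excludes)."""
    if isinstance(spec, bool):
        # true/false simply switches _source retrieval on or off.
        return spec, [], []
    if isinstance(spec, dict):
        return True, _as_list(spec.get("include")), _as_list(spec.get("exclude"))
    # A bare string or a list of strings is shorthand for an include list.
    return True, _as_list(spec), []


# The five per-document forms from the example above.
specs = [
    False,
    "title",
    ["title", "author"],
    {"include": "content", "exclude": "content.full_text"},
    {"include": ["title", "content"], "exclude": ["content.full_text"]},
]
normalized = [normalize_source_spec(s) for s in specs]
```

Normalising first keeps the later filtering step independent of which shorthand the caller used.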

Get_source API

The get/_source API is already dedicated to _source retrieval, so its parameter names differ slightly:

curl -XGET 'http://localhost:9200/index/type/1/_source?include=title,content&exclude=content.full_text'

Explain API

The explain API also offers the fields parameter. It is now extended with the same query string parameters as the get API:

curl -XPOST 'http://localhost:9200/index/type/1/_explain?_source=false' -d'{
    "query" : { "term" : { "message" : "search" } }
}'

or

curl -XPOST 'http://localhost:9200/index/type/1/_explain?_source=title,author' -d'{
    "query" : { "term" : { "message" : "search" } }
}'

and

curl -XPOST 'http://localhost:9200/index/type/1/_explain?_source_include=title,content&_source_exclude=content.full_text' -d'{
    "query" : { "term" : { "message" : "search" } }
}'

Search API

The search API gains an extra _source key in the request body, with the same options as above:

curl -XPOST 'http://localhost:9200/_search' -d'{
    "query" : { "term" : { "message" : "search" } },
    "_source" : false
}'

and

curl -XPOST 'http://localhost:9200/_search' -d'{
    "query" : { "term" : { "message" : "search" } },
    "_source" : "title"
}'
curl -XPOST 'http://localhost:9200/_search' -d'{
    "query" : { "term" : { "message" : "search" } },
    "_source" : [ "title" , "author" ]
}'
curl -XPOST 'http://localhost:9200/_search' -d'{
    "query" : { "term" : { "message" : "search" } },
    "_source": { "include": [ "title", "content" ] , "exclude" : [ "content.full_text" ]} 
}'

The search API also accepts _source retrieval settings as query string parameters, in the same format as the get API: _source, _source_include and _source_exclude. When settings are supplied both in the request body and on the query string, the query string parameters override the body.
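
The precedence rule can be sketched in Python as follows. This is only a model under stated assumptions: source_spec_from_query and effective_source_spec are hypothetical names, and it assumes that any _source setting on the query string replaces the body's _source as a whole (rather than being merged with it) and that _source_include takes priority over a field list passed in _source.

```python
def _split(value):
    """Split a comma-separated parameter value into a list of field names."""
    return [part for part in value.split(",") if part] if value else []


def source_spec_from_query(params):
    """Build a _source setting from query string params, or None if none are given."""
    raw = params.get("_source")
    include = _split(params.get("_source_include"))
    exclude = _split(params.get("_source_exclude"))
    if raw == "false":
        return False
    if raw not in (None, "true"):
        # Anything other than true/false is read as a list of fields to include.
        include = include or _split(raw)
    if raw is None and not include and not exclude:
        return None
    if not include and not exclude:
        return True
    return {"include": include, "exclude": exclude}


def effective_source_spec(body, params):
    """Query string settings win; otherwise use the body; default is true."""
    from_query = source_spec_from_query(params)
    if from_query is not None:
        return from_query
    return body.get("_source", True)


body = {"query": {"term": {"message": "search"}}, "_source": ["title", "author"]}
overridden = effective_source_spec(body, {"_source": "false"})
from_body = effective_source_spec(body, {})
```

With these inputs, overridden is False because the query string wins, while from_body falls back to the list given in the request body.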
