[DOCS] Reformat delimited payload token filter docs #49380
jrodewig merged 6 commits into elastic:master from jrodewig:reformat.delimited-payload-token-filter
Pinging @elastic/es-search (:Search/Analysis)
Pinging @elastic/es-docs (>docs)
mayya-sharipova
left a comment
@jrodewig Thanks, nice work, but needs some changes.
====
A payload is user-defined binary data associated with a token position and
stored as base64-encoded bytes. Payloads are often used with the
<<query-dsl-script-score-query,`script_score`>> query to calculate custom scores
I guess it would be nice to do this some time, but currently there is no way for script_score to access payloads. I don't know of any other ES query that can deal with payloads either. The only way to access them is through the _termvectors API.
Thank you for catching this. I've updated this note to state that you can view stored payloads using the term vectors API.
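For readers following along, here is a minimal Python sketch of what "viewing" a payload returned by the term vectors API can involve. It assumes the filter's `float` encoding produced a 4-byte big-endian value, which the API then returns as a base64 string. The `sample_token` dictionary is a hypothetical excerpt for illustration, not output taken from this PR.

```python
import base64
import struct


def decode_float_payload(b64_payload: str) -> float:
    """Decode a base64 payload string into a Python float.

    Assumption: the payload was written as a 4-byte big-endian IEEE 754 float.
    """
    raw = base64.b64decode(b64_payload)
    (value,) = struct.unpack(">f", raw)
    return value


# Hypothetical token entry, shaped like one element of a term vectors response.
sample_token = {"position": 1, "payload": "QEAAAA=="}

print(decode_float_payload(sample_token["payload"]))  # prints 3.0
```

If the filter were configured with a different encoding, the `struct` format string would need to change accordingly.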
`with_positions_offsets_payloads`

* Use an index analyzer that includes the `delimted_payload` filter
====
I am not sure this paragraph is correct. For a text/keyword field, ES can create three Lucene fields: 1) The usual indexed field, broken into terms and used for search. We don't store payloads here (I think), and we definitely never use them in any search query. 2) A stored field, enabled with the option store: true. I am not sure why you mention it here; it seems to me that it has nothing to do with payloads. 3) A term vectors field, enabled with the term_vector option. Term vectors are primarily used for highlighting, but can also be used just for retrieval (as in your examples). Here we can store payloads, but again we don't use them for anything beyond retrieval.
nit: delimted_payload -> delimited_payload
Thanks for another great catch. I've removed references to the store requirement throughout. I also fixed the typo.
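As background for the discussion above, the splitting step that attaches a payload to each term position can be approximated in plain Python. This is only an illustrative sketch, not the Lucene implementation: it assumes whitespace tokenization, the default `|` delimiter, and a split at the first delimiter in each token. The function name `split_delimited_payloads` is hypothetical.

```python
def split_delimited_payloads(text, delimiter="|"):
    """Approximate the filter: separate each token from its payload.

    Assumptions: tokens are whitespace-separated, and each token is split
    at the first occurrence of the delimiter.
    """
    tokens = []
    for position, chunk in enumerate(text.split()):
        term, sep, payload = chunk.partition(delimiter)
        tokens.append(
            {
                "token": term,
                "position": position,
                # Tokens without a delimiter carry no payload.
                "payload": payload if sep else None,
            }
        )
    return tokens


for entry in split_delimited_payloads("the|0 brown|3 fox"):
    print(entry)
```

The search index only sees the `token` values; the payloads are what a term vectors field can keep for later retrieval.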
Use the <<indices-create-index,create index API>> to create an index that:

* Includes a field that stores payloads. For this field, set the
<<mapping-store,`store`>> mapping parameter to `true` and the
I am not sure why store: true is necessary.
"Includes a field that stores payloads" -> I would reformulate this to something like "stores term vectors with payloads", as it is not a usual indexed field with payloads.
It's not. I've rephrased this bullet to "Includes a field that stores term vectors with payloads." as you suggested. Thanks!
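To make the agreed wording concrete, the following Python sketch builds the two request bodies the revised docs describe: a create index body whose field stores term vectors with payloads, and a term vectors request body. The field name `text` and analyzer name `payload_delimiter` are illustrative assumptions, and the dictionaries are only printed here rather than sent to a cluster. Note that `store` is not set anywhere.

```python
import json

# Hypothetical names, chosen for illustration only.
FIELD_NAME = "text"
ANALYZER_NAME = "payload_delimiter"

# Body for the create index API: a custom analyzer using the
# delimited_payload filter, and a field that stores term vectors with payloads.
create_index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                ANALYZER_NAME: {
                    "tokenizer": "whitespace",
                    "filter": ["delimited_payload"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            FIELD_NAME: {
                "type": "text",
                "term_vector": "with_positions_offsets_payloads",
                "analyzer": ANALYZER_NAME,
            }
        }
    },
}

# Body for the term vectors API: ask for payloads on the same field.
termvectors_body = {"fields": [FIELD_NAME], "payloads": True}

print(json.dumps(create_index_body, indent=2))
print(json.dumps(termvectors_body, indent=2))
```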
Thanks for your thorough review, @mayya-sharipova. You cleared up some of my misunderstandings around storing payloads and their use cases. This PR is ready for another look at your convenience. Thanks again!
mayya-sharipova
left a comment
Thanks @jrodewig, the changes LGTM
* Adds a title abbreviation
* Relocates the older name deprecation warning
* Updates the description and adds a Lucene link
* Adds a note to explain payloads and how to store them
* Adds analyze and custom analyzer snippets
* Adds a 'Return stored payloads' example
Reformats the delimited payload token filter docs as part of #44726.
Changes
@mayya-sharipova @jtibshirani Would you mind looking at this at your convenience? I want to ensure the added information about payloads and term vectors works. Thanks!