Document Seq No powered optimistic concurrency control#37284
Document Seq No powered optimistic concurrency control#37284bleskes merged 5 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-distributed |
zuketo
left a comment
There was a problem hiding this comment.
Looks good, just a few minor suggestions
docs/reference/docs/bulk.asciidoc
Outdated
| === Optimistic Concurrency Control | ||
|
|
||
| Each index and delete bulk item can include the `if_seq_no` and `if_primary_term` | ||
| parameters in their respective action and meta data lines. These parameters |
There was a problem hiding this comment.
Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines.
docs/reference/docs/bulk.asciidoc
Outdated
| Each index and delete bulk item can include the `if_seq_no` and `if_primary_term` | ||
| parameters in their respective action and meta data lines. These parameters | ||
| allow controlling how these operations will be performed based on the last | ||
| modification to existing documents. See <<optimistic-concurrency-control>> for more details. |
There was a problem hiding this comment.
The if_seq_no and if_primary_term parameters control how operations are executed, based on the most recently modified version of an existing document.
There was a problem hiding this comment.
Adopted the first part. I left out the "modified version" suggestion because it's not about the document version but rather the seq# and primary term of the modification.
| allow controlling how these operations will be performed based on the last | ||
| modification to existing documents. See <<optimistic-concurrency-control>> for more details. | ||
|
|
||
|
|
There was a problem hiding this comment.
Can we include a very short example for passing the if_seq_no and if_primary_term parameters to the bulk API?
There was a problem hiding this comment.
good question. we don't do it for other parameters, like routing etc. I think we should consistent and adding it to all may be very verbose.
| Elasticsearch needs a way of ensuring that an older version of a document never | ||
| overwrites a newer version. | ||
|
|
||
| To ensure this, every operation performed to a document is assigned a sequence number |
There was a problem hiding this comment.
To ensure an older version of a document doesn't overwrite a newer version, every operation performed is assigned a sequence number ...
| To ensure this, every operation performed to a document is assigned a sequence number | ||
| by the primary shard that coordinates that change. The sequence number is increased | ||
| with each operation and thus newer operations are guaranteed to have a higher sequence | ||
| number. Elasticsearch can then use the sequence number of operations to make sure they |
| never override a newer document version is never overridden by a change that has a | ||
| smaller sequence number assigned to it. | ||
|
|
||
| For example, the following indexing command will create a document and assign it its |
|
|
||
|
|
||
| Elasticsearch keeps tracks of the sequence number and primary of the last | ||
| operation to have changed each of the document it stores. These are returned |
There was a problem hiding this comment.
Elasticsearch keeps tracks of the sequence number and primary term for every document it stores, this value will change based on the last operation to modify the document. The _seq_no and _primary_term fields are returned
There was a problem hiding this comment.
I will adopt the second part of the suggestion, but I feel the first part is not accurate - there's no sequence number of a document but rather a seq of the last operation to have changed the doc.
| for each search hit by requesting the `_seq_no` and `_primary_term` <<search-request-docvalue-fields,Doc Value Fields>>. | ||
|
|
||
| The sequence number and the primary term uniquely identify a change. By noting down | ||
| the sequence number and primary term return, you can make sure to only change the |
| is done by setting the `if_seq_no` and `if_primary_term` parameters of either the | ||
| <<docs-index_,Index API>> or the <<docs-delete,Delete API>>. | ||
|
|
||
| For example, the following indexing call, will make sure to add a tag to the |
There was a problem hiding this comment.
no comma needed:
For example, the following indexing call will make sure to add a tag to the
|
Thanks @zuketo . I addressed your comments. Can you please take another look? |
| allow controlling how these operations will be performed based on the last | ||
| modification to existing documents. See <<optimistic-concurrency-control>> for more details. | ||
|
|
||
|
|
| operation and thus newer operations are guaranteed to have a higher sequence | ||
| number than older operations. Elasticsearch can then use the sequence number of | ||
| operations to make sure they never override a newer document version is never | ||
| overridden by a change that has a smaller sequence number assigned to it. |
There was a problem hiding this comment.
something is wrong with this sentence.
|
|
||
| The above will succeed since the the supplied version of 2 is higher than | ||
| the current document version of 1. If the document was already updated | ||
| and it's version was set to 2 or higher, the indexing command will fail |
| WARNING: External versioning supports the value 0 as a valid version number. | ||
| This allows the version to be in sync with an external versioning system | ||
| where version numbers start from zero instead of one. It has the side effect | ||
| that documents with version number equal to zero cannot neither be updated |
| database, as long as version numbers from the source database are used. | ||
| Even the simple case of updating the Elasticsearch index using data from | ||
| a database is simplified if external versioning is used, as only the | ||
| latest version will be used if the index operations are out of order for |
| the different version types and their semantics. | ||
|
|
||
| `internal`:: only index the document if the given version is identical to the version | ||
| of the stored document. |
There was a problem hiding this comment.
should we point to optimistic concurrency control here and make it clear that using if_seq_no is the preferred method for this?
There was a problem hiding this comment.
I decided to only do this once internal versioning is deprecated. I can add a note if you prefer
There was a problem hiding this comment.
I decided to only do this once internal versioning is deprecated
ok, if the deprecation is being added to 6.6, I'm fine with that.
There was a problem hiding this comment.
I don't think we'll make the 6.6 time frame as we need to first remove all usages of it (only reindex is left)
|
@ywelsch sorry, I didn't see your comments on the page I've hit the merge button on. I'll address them in a follow up commit. |
Add documentation to describe the new sequence number powered optimistic concurrency control
Relates #36148
Relates #10708