backupccl: fix off by one index in fileSSTSink file extension#98041
backupccl: fix off by one index in fileSSTSink file extension#98041craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
3c046b5 to
0ab3e4e
Compare
Currently, the logic that extends the last flushed file fileSSTSink does not trigger if there is only one flushed file. This failure to extend the first flushed file can result in file entries in the backup manifest with duplicate start keys. For example, if the first export response written to the sink contains partial entries of a single key `a`, then the span of the first file will be `a-a`, and the span of the subsequent file will always be `a-<end_key>`. The presence of these duplicate start keys breaks the encoding of the external manifest files list SST as the file path + start key combination in the manifest are assumed to be unique. Release note: None
|
bors r+ |
|
Build succeeded: |
| // ended and has the same time-bounds -- then we can simply extend that span | ||
| // and add to its entry counts. Otherwise we need to record it separately. | ||
| if l := len(s.flushedFiles) - 1; l > 0 && s.flushedFiles[l].Span.EndKey.Equal(span.Key) && | ||
| if l := len(s.flushedFiles) - 1; l >= 0 && s.flushedFiles[l].Span.EndKey.Equal(span.Key) && |
There was a problem hiding this comment.
Can this logic be expressed in a way that a future person won’t re-introduce an off-by-one?
- The
-1isn’t great -- how can a reader infer why that’s there? - The
>=isn’t great either. Most loop bounds should have a clean<. - Maybe the several conditions on
s.flushedFiles[l]should be inside the loop, with abreakorcontinue. - Some well-named local vars would increase clarity.
(Bikeshed, can this be something other than l which looks like 1?)
There was a problem hiding this comment.
Postmortem analysis of the above is kind of pointless unless we can replay the stream of thoughts exactly at the time OBOE was introduced. The choice of the mnemonic to denote the last position may have played some role, but I doubt that was the reason; i.e., l usually stands for length; I would have used last. More likely, the issue is the cognitive dissonance of 0-based and 1-based indexing, both of which are prevalent. E.g., when we talk about nodes in the cluster, we refer to n1 as the first; there are plenty of other examples.
|
@rhu713 we should backport this to all release branches? |
|
blathers backport 22.2 22.1 |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 6b39306 to blathers/backport-release-22.2-98041: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2 failed. See errors above. error creating merge commit from 6b39306 to blathers/backport-release-22.1-98041: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Currently, the logic that extends the last flushed file fileSSTSink does not trigger if there is only one flushed file. This failure to extend the first flushed file can result in file entries in the backup manifest with duplicate start keys. For example, if the first export response written to the sink contains partial entries of a single key
a, then the span of the first file will bea-a, and the span of the subsequent file will always bea-<end_key>. The presence of these duplicate start keys breaks the encoding of the external manifest files list SST as the file path + start key combination in the manifest are assumed to be unique.Fixes #97953
Release note: None