-
Notifications
You must be signed in to change notification settings - Fork 4.1k
storage: prevent future timestamps from leaking #33632
Description
Related to #7526.
Resulted from a discussion on #33523.
We are lacking discipline about creating hlc timestamps, updating our local hlc clock, and sending hlc timestamps to other nodes. We should rethink which timestamps are used to update remote nodes and where these timestamps are allowed to come from.
@tbg , me and Nathan are discussing things here and would like some opinion.
The issue is that we're about to put something in the ts cache. That something needs to be <= the node's current clock.
The value that we're about to put is the timestamp to which we're pushing the pushee. This timestamp comes from the pusher (another node) is pushing to a timestamp <= its clock. ThePushTxnrequest doesn't have aheader.Txnfilled in [1]. I think this means that it also doesn't haveheader.Timestampfilled in (as clients don't put timestamps on non-transactional requests; the timestamps for these requests are filled in by the store). So, I think this all means that somewhere we need to be doingr.store.Clock().Update(pushee.Timestamp), as the patch has below. I suggested we do that during evaluation, inevalPushTxn(). What do you think?
Moreover, the patch does
reply.PusheeTxn.Timestamp = args.PushTo.Add(0, 1)inevalPushTxnbecause it might need to write the txn record with that timestamp. And so, particularly since we're generating a timestamp out of thin air, I think the node's clock should be updated immediately to account for that generated timestamp, right?
[1]
if cArgs.Header.Txn != nil {
Maybe the argument that the node's clock should be below any logical timestamp we operate with is not quite funded - in applyTimestampCache we do nextTS := rTS.Next() which I guess can return a value above the local clock, and we seem cool with that.
@tbg:
Yes, this (calling
.Next()and thus leaking a possible future timestamp) is a problem that we should address (though it's hopefully only a theoretical one). The problem is likely theoretical because reading a clock value from the hlc already also increments the logical (if it doesn't already do it to the wall time), so you need multiple logical ticks to pile up for something to get out of order. The code for the fix would be
nextTS := rTS.Next() hlc.Update(nextTS)
but it's starting to really irk me that we're grabbing the hlc lock in so many places (this can be optimized if we had a "slightly stale" reading of the clock, because most of the time nextTS is below anyway and so nothing needs to be updated).
The PushTxn issue is also an issue. The fact that we upgrade the hlc "manually" is bad. The way this could work conceptually is that each request to another node has a
hlc.Now()reading that is applied at the recipient before doing anything else with that message.