
storage: prevent future timestamps from leaking #33632

@nvb

Description


Related to #7526.
Resulted from a discussion on #33523.

We are lacking discipline about creating hlc timestamps, updating our local hlc clock, and sending hlc timestamps to other nodes. We should rethink which timestamps are used to update remote nodes and where these timestamps are allowed to come from.

@andreimatei:

@tbg , me and Nathan are discussing things here and would like some opinion.
The value that we're about to put is the timestamp to which we're pushing the pushee. This timestamp comes from the pusher (another node) is pushing to a timestamp <= its clock. The PushTxn request doesn't have a header.Txn filled in [1]. I think this means that it also doesn't have header.Timestamp filled in (as clients don't put timestamps on non-transactional requests; the timestamps for these requests are filled in by the store). So, I think this all means that somewhere we need to be doing r.store.Clock().Update(pushee.Timestamp), as the patch has below. I suggested we do that during evaluation, in evalPushTxn(). What do you think?

Moreover, the patch does reply.PusheeTxn.Timestamp = args.PushTo.Add(0, 1) in evalPushTxn because it might need to write the txn record with that timestamp. And so, particularly since we're generating a timestamp out of thin air, I think the node's clock should be updated immediately to account for that generated timestamp, right?

[1]

if cArgs.Header.Txn != nil {

Maybe the argument that any logical timestamp we operate with should be below the node's clock is not quite founded - in applyTimestampCache we do nextTS := rTS.Next(), which I guess can return a value above the local clock, and we seem cool with that.
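To make the pattern under discussion concrete, here is a minimal sketch of the two clock updates proposed for evalPushTxn. The Timestamp, Clock, and evalPushTxn names below are simplified stand-ins for the real cockroachdb types (hlc.Timestamp, hlc.Clock, batcheval), not the actual implementation:

```go
package main

import "fmt"

// Timestamp is a simplified stand-in for hlc.Timestamp: wall time in
// nanoseconds plus a logical tie-breaker.
type Timestamp struct {
	WallTime int64
	Logical  int32
}

func (t Timestamp) Less(o Timestamp) bool {
	return t.WallTime < o.WallTime ||
		(t.WallTime == o.WallTime && t.Logical < o.Logical)
}

// Add offsets the wall and logical components, mirroring the
// args.PushTo.Add(0, 1) call discussed above.
func (t Timestamp) Add(wall int64, logical int32) Timestamp {
	return Timestamp{WallTime: t.WallTime + wall, Logical: t.Logical + logical}
}

// Clock is a toy HLC that only ratchets forward.
type Clock struct{ latest Timestamp }

// Update folds an observed timestamp into the clock, keeping the
// invariant that the clock is >= every timestamp the node has seen.
func (c *Clock) Update(t Timestamp) {
	if c.latest.Less(t) {
		c.latest = t
	}
}

// evalPushTxn sketches the two updates discussed above: fold the
// pushee's remotely-originated timestamp into the local clock, and
// account for the PushTo.Add(0, 1) timestamp generated "out of thin
// air" before it can leak to the ts cache or the txn record.
func evalPushTxn(clock *Clock, pusheeTS, pushTo Timestamp) Timestamp {
	clock.Update(pusheeTS) // remote timestamp observed by this node

	newTS := pushTo.Add(0, 1) // timestamp the pushee is pushed to
	clock.Update(newTS)       // keep the clock >= any timestamp we emit
	return newTS
}

func main() {
	clock := &Clock{latest: Timestamp{WallTime: 100}}
	newTS := evalPushTxn(clock, Timestamp{WallTime: 200}, Timestamp{WallTime: 300})
	fmt.Println(newTS)                    // {300 1}
	fmt.Println(clock.latest.Less(newTS)) // false: the clock has caught up
}
```

With both Update calls in place, anything evalPushTxn writes or caches is guaranteed to be <= the node's clock, which is the invariant the ts cache wants.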

@tbg:

Yes, this (calling .Next() and thus leaking a possible future timestamp) is a problem that we should address (though it's hopefully only a theoretical one). The problem is likely theoretical because reading a clock value from the hlc already increments the logical component (if it doesn't already advance the wall time), so multiple logical ticks need to pile up before something gets out of order. The code for the fix would be

nextTS := rTS.Next()
hlc.Update(nextTS)

but it's starting to really irk me that we're grabbing the hlc lock in so many places (this can be optimized if we had a "slightly stale" reading of the clock, because most of the time nextTS is below anyway and so nothing needs to be updated).

The PushTxn issue is also an issue. The fact that we upgrade the hlc "manually" is bad. The way this could work conceptually is that each request to another node has a hlc.Now() reading that is applied at the recipient before doing anything else with that message.
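The "clock reading applied at the recipient" idea above could be sketched like this. The Request and handle names are hypothetical stand-ins for illustration; in CockroachDB the clock reading would travel in the BatchRequest header:

```go
package main

import "fmt"

// Timestamp is a simplified stand-in for hlc.Timestamp.
type Timestamp struct {
	WallTime int64
	Logical  int32
}

func (t Timestamp) Less(o Timestamp) bool {
	return t.WallTime < o.WallTime ||
		(t.WallTime == o.WallTime && t.Logical < o.Logical)
}

// Clock is a toy forward-only HLC.
type Clock struct{ latest Timestamp }

func (c *Clock) Update(t Timestamp) {
	if c.latest.Less(t) {
		c.latest = t
	}
}

// Now ticks the logical component; a real HLC would also read
// physical time and take the max.
func (c *Clock) Now() Timestamp {
	c.latest.Logical++
	return c.latest
}

// Request carries the sender's clock reading alongside its payload,
// so no per-field "manual" hlc upgrades are needed downstream.
type Request struct {
	Now     Timestamp
	Payload string
}

// handle applies the sender's clock reading before doing anything
// else with the message, as proposed above.
func handle(clock *Clock, req Request) {
	clock.Update(req.Now)
	// ... process req.Payload; all timestamps in it are now <= clock ...
}

func main() {
	sender := &Clock{latest: Timestamp{WallTime: 900}}
	recipient := &Clock{latest: Timestamp{WallTime: 100}}

	req := Request{Now: sender.Now(), Payload: "PushTxn"}
	handle(recipient, req)

	fmt.Println(recipient.latest) // {900 1}
}
```

Since the attached reading is >= every timestamp the sender put in the message, a single Update at the top of the handler subsumes the scattered per-field updates (like the pushee.Timestamp one) discussed earlier.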


Labels: A-kv-client (Relating to the KV client and the KV interface.), C-cleanup (Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.)
