Push Txns & Resolve Intents on Conflicts #72

Merged

spencerkimball merged 3 commits into master from spencerkimball/resolve-write-intents on Sep 21, 2014

Conversation

@spencerkimball
Member

Automatically push transactions on write intent failures and resolve
conflicting write intents. This is done at the level of
Store.ExecuteCmd on command failure. Write/write conflicts are
pushed and resolved, but must still return the write intent
error to the client so the client can retry with a new client command
ID, ensuring idempotence. This way, the original request (the one
which hit the conflict) will always return a write intent error via
the response cache, while follow-on requests with the new ClientCmdID
can succeed and will always succeed on retries. Read/write conflicts
are automatically retried instead of returning the write intent error,
because they don't have the same idempotency issues with the client
command ID.
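
The flow described above looks roughly like the sketch below. This is an illustrative toy model, not the actual Store.ExecuteCmd code: the Request, WriteIntentError, and store types and the pushTxnAndResolve helper are simplified stand-ins.

    package main

    import (
        "errors"
        "fmt"
    )

    // Simplified stand-ins for the real proto and storage types.
    type WriteIntentError struct {
        Key      string
        Resolved bool // whether the conflicting intent was successfully resolved
    }

    func (e *WriteIntentError) Error() string {
        return fmt.Sprintf("conflicting write intent at key %q", e.Key)
    }

    type Request struct {
        Key      string
        ReadOnly bool
    }

    type store struct {
        intents map[string]bool // keys with an outstanding write intent
    }

    func (s *store) execute(r Request) error {
        if s.intents[r.Key] {
            return &WriteIntentError{Key: r.Key}
        }
        return nil
    }

    // pushTxnAndResolve stands in for pushing the conflicting transaction
    // and resolving its write intent.
    func (s *store) pushTxnAndResolve(key string) error {
        delete(s.intents, key)
        return nil
    }

    // executeCmd sketches the conflict handling described above.
    func (s *store) executeCmd(r Request) error {
        err := s.execute(r)
        var wiErr *WriteIntentError
        if !errors.As(err, &wiErr) {
            return err
        }
        // Push the conflicting txn and resolve its intent; record success so
        // the client knows whether to retry immediately or back off.
        wiErr.Resolved = s.pushTxnAndResolve(wiErr.Key) == nil

        if r.ReadOnly && wiErr.Resolved {
            // Read/write conflict: safe to retry in place; no ClientCmdID concerns.
            return s.executeCmd(r)
        }
        // Write/write conflict: surface the error so the client retries with a
        // fresh ClientCmdID, keeping the response cache idempotent.
        return wiErr
    }

    func main() {
        s := &store{intents: map[string]bool{"a": true, "b": true}}
        fmt.Println(s.executeCmd(Request{Key: "a", ReadOnly: true}))  // nil after automatic retry
        fmt.Println(s.executeCmd(Request{Key: "b", ReadOnly: false})) // write intent error, Resolved=true
    }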

Reorg'd a bit of code between store and range to move range-specific
checks into range (addition of Range.AddCmd method). Also added the
ability to submit commands to Raft without always having to wait for
completion. This is used to "fire and forget" resolve intent commands.
Making sure we clean up entries pending in the "read queue" is slightly
tricky in Range.addReadWriteCmd.
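
A minimal sketch of the wait/no-wait split, with a goroutine and channel standing in for the Raft proposal and completion signal (again a toy model, not the actual Range.AddCmd plumbing):

    package main

    import (
        "fmt"
        "log"
        "time"
    )

    type rng struct{}

    // applyRaftCommand stands in for proposing the command to Raft and applying it.
    func (r *rng) applyRaftCommand(method string) error {
        time.Sleep(10 * time.Millisecond) // pretend replication takes a moment
        return nil
    }

    // addWriteCmd sketches submitting a command with or without waiting.
    func (r *rng) addWriteCmd(method string, wait bool) error {
        done := make(chan error, 1)
        go func() { done <- r.applyRaftCommand(method) }()
        if !wait {
            // Fire and forget (e.g. ResolveIntent): report failures in the log,
            // since no caller is waiting on the result.
            go func() {
                if err := <-done; err != nil {
                    log.Printf("error executing %s without waiting: %v", method, err)
                }
            }()
            return nil
        }
        return <-done
    }

    func main() {
        r := &rng{}
        fmt.Println(r.addWriteCmd("Put", true))            // waits for completion
        fmt.Println(r.addWriteCmd("ResolveIntent", false)) // returns immediately
        time.Sleep(50 * time.Millisecond)                  // let the async command finish
    }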

Moved writeIntentError and writeTooOldError from mvcc.go to the proto/
subdirectory as they're communicated across RPCs from server to clients.
Added the conflicting Key and a Resolved flag to WriteIntentError.
This allows the client to decide whether to back off (if the resolve
failed) or retry immediately.
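
On the client side, the Resolved flag could drive the retry policy roughly as follows (a hypothetical sketch; the field names match the description above, but the retryDelay helper is invented for illustration):

    package main

    import (
        "fmt"
        "time"
    )

    // Client-side view of the error described above.
    type WriteIntentError struct {
        Key      string
        Resolved bool
    }

    func (e *WriteIntentError) Error() string { return "write intent at " + e.Key }

    // retryDelay returns how long to wait before retrying: zero (retry right
    // away with a fresh ClientCmdID) if the server already resolved the
    // intent, otherwise an exponential backoff so the conflicting txn can
    // make progress.
    func retryDelay(err *WriteIntentError, attempt int) time.Duration {
        if err.Resolved {
            return 0
        }
        backoff := 50 * time.Millisecond << uint(attempt)
        if backoff > time.Second {
            backoff = time.Second
        }
        return backoff
    }

    func main() {
        fmt.Println(retryDelay(&WriteIntentError{Key: "a", Resolved: true}, 0))  // 0s
        fmt.Println(retryDelay(&WriteIntentError{Key: "a", Resolved: false}, 2)) // 200ms
    }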

@spencerkimball
Member Author

@bdarnell @tschottdorf @cockroachdb/developers

storage/range.go Outdated
Contributor

A log.V(1) is a pretty weak form of "surfaced somewhere"; I'd make this at least a log.Info. I might also log at V(1) even if wait is true.

Member Author

I changed this to a warning and added a V(1) info in Range.executeCmd to always log execution of commands.
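
Roughly, the logging arrangement described here, sketched with the glog-style API (the actual util/log calls in the repository may differ):

    package main

    import "github.com/golang/glog"

    // executeCmd logs every command at verbosity 1, per the comment above.
    func executeCmd(method string) error {
        if glog.V(1) {
            glog.Infof("executing %s", method)
        }
        // ... run the command ...
        return nil
    }

    // addWriteCmd surfaces failures of fire-and-forget commands as warnings,
    // since no caller is waiting on the result.
    func addWriteCmd(method string, wait bool) {
        if err := executeCmd(method); err != nil && !wait {
            glog.Warningf("error executing %s without waiting: %v", method, err)
        }
    }

    func main() {
        addWriteCmd("ResolveIntent", false)
        glog.Flush()
    }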

@bdarnell
Contributor

LGTM

storage/store.go Outdated
Contributor

We may want to add a log here to see how frequently this happens.

We could also potentially resubmit the command here immediately, without a new ClientCmdID, to optimize latency (we'd need to work around the respCache).

Member Author

I added a V(1) log for the method as a whole. I figured we can get a bit fancier with optimizations in future changes. Right now, I wanted to make sure we keep our idempotency guarantees. We'll need to start optimizing all of this at some point, but let's do it right, using appropriate benchmarks, and avoid premature optimization.

@jiangmingyang
Contributor

LGTM

spencerkimball added a commit that referenced this pull request Sep 21, 2014
…intents

Push Txns & Resolve Intents on Conflicts
@spencerkimball merged commit efc7718 into master on Sep 21, 2014
@spencerkimball deleted the spencerkimball/resolve-write-intents branch on September 21, 2014 19:13
tbg referenced this pull request in tbg/cockroach Aug 31, 2016
We have been seeing long startup times which disappear spontaneously. During a
restart of the beta cluster, the following goroutine was observed, which suggests
that we were spending a lot of time GCing replicas on startup.

    engine              ??:0                     _Cfunc_DBIterNext(#324, #323, 0, 0, 0, 0, 0, 0, 0)
    engine              rocksdb.go:1135          (*rocksDBIterator).Next(#235)
    storage             replica_data_iter.go:104 (*ReplicaDataIterator).Next(#316)
    storage             store.go:1748            (*Store).destroyReplicaData(#109, #317, 0, 0)
    storage             store.go:841             (*Store).Start.func2(0x101b, #300, 0x36, 0x40, #301, 0x36, 0x40, #315, 0x3, 0x4, ...)
    storage             store.go:734             IterateRangeDescriptors.func1(#306, 0x40, 0x41, #307, 0x92, 0x92, #341, 0, 0x186c, 0x4000, ...)
    engine              mvcc.go:1593             MVCCIterate(#329, #68, #47, #81, #232, 0x9, 0x10, #233, 0xb, 0x10, ...)
    storage             store.go:738             IterateRangeDescriptors(#330, #196, #47, #81, #195, #179, #110)
    storage             store.go:867             (*Store).Start(#109, #330, #196, #179, #185, 0x1)
    server              node.go:405              (*Node).initStores(#78, #330, #196, #98, 0x1, 0x1, #179, 0, #55)
    server              node.go:330              (*Node).start(#78, #330, #196, #42, #129, #98, 0x1, 0x1, 0, 0, ...)
    server              server.go:431            (*Server).Start(#5, #330, #196, #95, 0x1)
    cli                 start.go:368             runStart(#34, #178, 0, 0x9, 0, 0)
    cobra               command.go:599           (*Command).execute(#34, #177, 0x9, 0x9, #34, #177)
    cobra               command.go:689           (*Command).ExecuteC(#33, #70, #343, #72)
    cobra               command.go:648           (*Command).Execute(#33, #71, #343)
    cli                 cli.go:96                Run(#64, 0xa, 0xa, 0, 0)
    main                main.go:37               main()
soniabhishek pushed a commit to soniabhishek/cockroach that referenced this pull request Feb 15, 2017
…onversion

Changes to get the correct start time and end time for random wait ca…