Skip to content

RFC: range-local intent resolution#1873

Merged
tbg merged 1 commit intocockroachdb:masterfrom
tbg:intent_rfc
Jul 31, 2015
Merged

RFC: range-local intent resolution#1873
tbg merged 1 commit intocockroachdb:masterfrom
tbg:intent_rfc

Conversation

@tbg
Copy link
Copy Markdown
Member

@tbg tbg commented Jul 30, 2015

No description provided.

@tbg tbg added the PTAL label Jul 30, 2015
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can disallow cross-coordinator transactions without adding a new field. The first request in a transaction is already special since it does not include a transaction ID. We could move the creation of the heartbeat goroutine from sendOne to maybeBeginTxn and return an error if we receive a txn ID that is not in our txns map.

If the sequence number is useful for other purposes then it may be worth including, but I'd like to avoid it if we can because it removes opportunities for parallelism (with a sequence number the only way to parallelize operations is to include them in the same batch; without a sequence number requests may be sent separately in parallel as long as the client is smart enough to merge all the transaction objects before calling EndTransaction).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Happy to ditch the sequence number.

@bdarnell
Copy link
Copy Markdown
Contributor

LGTM

@tbg tbg force-pushed the intent_rfc branch 4 times, most recently from 9e812b9 to c5a5936 Compare July 30, 2015 20:13
@mrtracy
Copy link
Copy Markdown
Contributor

mrtracy commented Jul 30, 2015

LGTM. This should reduce the mental complexity of our transaction model.

@mrtracy
Copy link
Copy Markdown
Contributor

mrtracy commented Jul 30, 2015

Actually, I did have a question: won't this make it difficult to use transactions when consuming Cockroach over HTTP, in the case where external traffic goes through a load balancer? If your transaction spans multiple round-trips, I wouldn't think there is a guarantee that a VIP will send the individual requests to the same machine.

I'm hardly an expert on the subject, but that seems like a relatively common configuration.

@bdarnell
Copy link
Copy Markdown
Contributor

Yes, this means that you won't be able to use load balancers that don't enforce stickiness. This may be a good thing, since that configuration would be wasteful anyway (since multiple coordinators would start heartbeat loops).

If we wanted to, we could encode the coordinator's node ID in the proto.Transaction, and forward requests accordingly.

@tbg tbg force-pushed the intent_rfc branch 3 times, most recently from ad1b029 to d50a6ad Compare July 31, 2015 07:09
@tbg
Copy link
Copy Markdown
Member Author

tbg commented Jul 31, 2015

@mrtracy yeah. Whenever that becomes an issue, there are ways out with some added complexity. For example, the client collecting the intents, or what Ben suggested. Heartbeating may need to be revisited when we make those changes, but it'll be doable. Added that to the comment.

@tbg
Copy link
Copy Markdown
Member Author

tbg commented Jul 31, 2015

I finished disallowing coordinator-spanning transactions, but since conflict resolution in its current form actually starts coordinator-spanning Txns all the time, I have to untangle that first before I can post the progress for review.

tbg added a commit that referenced this pull request Jul 31, 2015
RFC: range-local intent resolution
@tbg tbg merged commit 9c56b9a into cockroachdb:master Jul 31, 2015
@tbg tbg deleted the intent_rfc branch July 31, 2015 20:29
@tbg tbg removed the PTAL label Jul 31, 2015
tbg added a commit to tbg/cockroach that referenced this pull request Aug 2, 2015
a subgoal of cockroachdb#1873 is making sure that a transaction can only write
on one coordinator (actually, the RFC proposes limiting reads to a
single coordinator as well, but this has proven unnecessary).

this change is carried out here. as a slight change from the RFC,
we do add an additional flag to the proto.Transaction struct, which
contains the information of whether previous requests in the Txn have
written data (i.e. laid down intents).
not doing that would have mandated giving up on the optimizations for
read-only transactions since we would have had to rely on special-
casing the first transactional request.
tbg added a commit to tbg/cockroach that referenced this pull request Aug 6, 2015
this passes all tests except for one (which I've disabled; maybe
@bdarnell has an idea?). Basically carried out a lot of the work announced in
RFC cockroachdb#1873:

- coordinator sends intents along with EndTransaction
- range resolves all it possibly can in the EndTransaction batch, and hands
  everything else off to an async resolver.
- "skipped intents" passed up by the range_command functions are now executed
  even if the command returns an error. This is required for EndTransaction
  which finds out that it's been aborted.

There are a lot of TODOs documented in the code, and I intend to address all
of those that are relevant for correctness. Some parts of the RFC are also
not yet implemented (gc'ing Txn entries), but the road should be very clear
and I'll add those before merging.

I did want to get this out early because it's a fairly complex refactoring
and will only get bigger from here.
@tbg tbg mentioned this pull request Aug 12, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants