Skip to content

[stability] sql: add special tracing to trap #7881#8681

Merged
andreimatei merged 4 commits intocockroachdb:masterfrom
andreimatei:fix-autocommit
Sep 10, 2016
Merged

[stability] sql: add special tracing to trap #7881#8681
andreimatei merged 4 commits intocockroachdb:masterfrom
andreimatei:fix-autocommit

Conversation

@andreimatei
Copy link
Copy Markdown
Contributor

@andreimatei andreimatei commented Aug 19, 2016

sql: create root spans for transactions

Also:

I've decided to go medieval and invest in tracing all SQL execution in
the hope that #7881 will be triggered again.
The tracing is gated by an env var. When turned on, every SQL txn
creates a new tracer that accumulates spans in the session's txnState
(this is similar to how "SQL tracing" currently works). When we detect
7881, we mark the root span of the current one of these traces for
"sampling", which means that later, when that span (and hence the trace)
is closed, we dump the trace with all the log messages in it (note that
currently this trace only has one span, since we're not very good at
starting child spans yet).

cc @knz, @cockroachdb/stability


This change is Reviewable

@petermattis
Copy link
Copy Markdown
Collaborator

Reviewed 4 of 4 files at r1, 1 of 1 files at r2.
Review status: 1 of 5 files reviewed at latest revision, 10 unresolved discussions, all commit checks successful.


internal/client/txn.go, line 103 [r1] (raw file):

  // initiated on this host.
  // It's also used by "EXPLAIN TRACE".
  // Note that in SQL land there's also TxnState.CollectedSpans. That guys

s/. That guys/which/g


sql/executor.go, line 520 [r3] (raw file):

          results, remainingStmts, err = runTxnAttempt(e, planMaker, origState, txnState, opt, stmtsToExec)

          // Until #7881 fixed.

Add TODO(andrei).


sql/executor.go, line 544 [r3] (raw file):

                  err, txnState.txn.Proto, execOpt.AutoRetry, execOpt.AutoCommit, stmts,
                  remainingStmts)
              // Until #7881 fixed.

TODO(andrei)


sql/executor.go, line 549 [r3] (raw file):

                      "txnState %+v, execOpt %+v, stmts %+v, remaining %+v",
                      err, txnState, execOpt, stmts, remainingStmts)
                  txnState.sp.SetBaggageItem(keyFor7881Sample, "sample me please")

Perhaps hide in a method: txnState.forceTrace().


sql/session.go, line 162 [r1] (raw file):

  State TxnStateEnum

  // baseCtx is used to init the contexts for every SQL txn, as this txnState is

s/baseCtx/Ctx/g


sql/session.go, line 253 [r1] (raw file):

  if ts.txn != nil {
      ts.CollectedSpans = append(ts.CollectedSpans, ts.txn.CollectedSpans...)

Might be worth optimizing the common case where the transaction is only executed once:

if len(ts.CollectedSpans) == 0 {
  ts.CollectedSpans = tx.txn.CollectedSpans
} else {
  ts.CollectedSpans = append(ts.CollectedSpans, ts.txn.CollectedSpans...)
}

sql/session.go, line 55 [r3] (raw file):

// span baggage key used for marking a span
var keyFor7881Sample = "found#7881"

This seems overly specific to 7881. Perhaps keyForceSample.


sql/session.go, line 245 [r3] (raw file):

          opentracing.ContextWithSpan(ctx, sp), tr)
      // The new span will be Finish()ed by FinishSpan().
      ts.sp = sp

Are you not using a snowball trace because of the greater expense?


util/log/structured.go, line 63 [r2] (raw file):

  file, line, _ := caller.Lookup(depth + 1)
  msg := makeMessage(ctx, format, args)
  Trace(ctx, file+":"+strconv.Itoa(line)+" "+msg)

Not sure if we want this. It isn't too hard to figure out which line generated a trace statement. Grep is your friend. @RaduBerinde thoughts?


util/tracing/tracer.go, line 94 [r3] (raw file):

  sp, err := JoinOrNew(tr, nil, "sql txn")
  return sp, tr, err
}

Seems like this belongs in sql, not here.


Comments from Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member

Do we want to retain some of this infrastructure beyond 7881? If so we should change to more generic names

I don't understand one thing - is the intention that we would collect traces from other nodes involved in the txn and dump those too? Because that would only happen if we enable snowball tracing for that txn no? (or am I missing something?)


Review status: 1 of 5 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.


sql/session.go, line 128 [r1] (raw file):

// reading descriptors consistent with the txn in which they'll be used?
func (s *Session) Ctx() context.Context {
  if s.TxnState.sp != nil {

Seems odd that we check sp here. If sp != nil happens to mean that the transaction is active, move it into a txnState.IsActive() method


sql/session.go, line 164 [r1] (raw file):

  // baseCtx is used to init the contexts for every SQL txn, as this txnState is
  // reset for every txn.
  Ctx context.Context

The comment is deceiving, "is used to init the contexts for every SQL txn" means it is some kind of generic context that is a parent for txn contexts, but from the code it seems it just is the transaction context.. I would rename to txnCtx and clarify the comment.


util/log/structured.go, line 63 [r2] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Not sure if we want this. It isn't too hard to figure out which line generated a trace statement. Grep is your friend. @RaduBerinde thoughts?

The logs already have these and sometimes they're useful (especially if we are logging an error, we may find the place where the error is generated based on the message, but it might be hard to figure out which layer logged it). That said, I haven't worked too much with traces.. An option would also be to put them at the end of the message where they might be less disruptive

Comments from Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

(Thanks guys! Please take another look.)

Well there's no "infrastructure" here to write home about. You know I'm a fan of tracing in principle, and we should have a way to sample everything and be able to get a hold of the current trace. But I don't think this is it, by creating a new tracer per request in a random place and accumulating the trace state in a random object. We should all sit down with Tobi too and have a heart to heart.

I didn't enable snowballing because, according to some comment by Peter, that's the expensive part of "sql tracing", so I didn't want to enable if for everything (and there's no way to enable it per trace). Plus frankly I don't think it'd be very useful for finding this bug.

@RaduBerinde I've made a change in AdminServe about the contexts of sql sessions opened from RPCs. I don't really know what the deal with contexts that come from gRPC is, maybe you do. Please take a look.


Review status: 0 of 7 files reviewed at latest revision, 12 unresolved discussions, some commit checks pending.


internal/client/txn.go, line 103 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/. That guys/which/g

Done.

sql/executor.go, line 520 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Add TODO(andrei).

Done.

sql/executor.go, line 544 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

TODO(andrei)

Done.

sql/executor.go, line 549 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Perhaps hide in a method: txnState.forceTrace().

i dunno... I wouldn't pollute `txnState` more than I did already.

sql/session.go, line 128 [r1] (raw file):

Previously, RaduBerinde wrote…

Seems odd that we check sp here. If sp != nil happens to mean that the transaction is active, move it into a txnState.IsActive() method

true. We're checking the right thing now. And I've renamed the functions that deal with the span to better suggest that the span is tied to a SQL txn's lifetime.

sql/session.go, line 162 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/baseCtx/Ctx/g

Done.

sql/session.go, line 164 [r1] (raw file):

Previously, RaduBerinde wrote…

The comment is deceiving, "is used to init the contexts for every SQL txn" means it is some kind of generic context that is a parent for txn contexts, but from the code it seems it just is the transaction context.. I would rename to txnCtx and clarify the comment.

I went the other way and actually made it a "base context". I made all transactions have a root span, and hence a modified child context. See if you dig. For this, I've had to ensure that the a session's ctx always has a tracer in it. It seems reasonable to me...

sql/session.go, line 253 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Might be worth optimizing the common case where the transaction is only executed once:

if len(ts.CollectedSpans) == 0 {
  ts.CollectedSpans = tx.txn.CollectedSpans
} else {
  ts.CollectedSpans = append(ts.CollectedSpans, ts.txn.CollectedSpans...)
}
Done.

sql/session.go, line 55 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

This seems overly specific to 7881. Perhaps keyForceSample.

Radu also said something about using more generic names, but I feel like nothing done here is generic, and it all needs to go away, in favor of a more general method of "forced sampling" that works with the global singleton tracer we should be using.

sql/session.go, line 245 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Are you not using a snowball trace because of the greater expense?

correct. I want to enable this on the beta cluster until I catch the sucker.

util/log/structured.go, line 63 [r2] (raw file):

Previously, RaduBerinde wrote…

The logs already have these and sometimes they're useful (especially if we are logging an error, we may find the place where the error is generated based on the message, but it might be hard to figure out which layer logged it). That said, I haven't worked too much with traces.. An option would also be to put them at the end of the message where they might be less disruptive

To be honest, I think the file and line are the most interesting part of these log messages :) There were definitely some messages that just said "error: xxx" and I had no idea where that's coming from. What's the downside of having them? I didn't put them in `log.Trace()` because maybe there's a cost to walking the stack, but here?

Here's how a trace looks without them (and it's gonna continue to look the same for most messages since they're log.Tracef()s):

I160818 22:48:29.702194 sql/session.go:291  SQL trace:
     0.046ms      0.000ms    executing 1/1: BEGIN TRANSACTION
     0.161ms      0.115ms    executing 1/1: DROP TABLE t.kv
     0.206ms      0.045ms    meta descriptor lookup
     0.355ms      0.149ms    looked up range descriptor
     0.591ms      0.236ms    looked up range descriptor
     0.613ms      0.022ms    sending RPC to [/Min, /Table/11)
     0.747ms      0.134ms    meta descriptor lookup
     0.751ms      0.004ms    looked up range descriptor from cache
     0.759ms      0.008ms    sending RPC to [/Min, /Table/11)
     0.863ms      0.104ms    meta descriptor lookup
     0.867ms      0.003ms    looked up range descriptor from cache
     0.875ms      0.008ms    sending RPC to [/Min, /Table/11)

And here's one with it:

     3.788ms      2.420ms    sql/event_log.go:109 Event: "drop_table", target: 51, info: {TableName:kv Statement:DROP TABLE t.kv User:root}

util/tracing/tracer.go, line 94 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Seems like this belongs in sql, not here.

I had it in sql initially, but it uses the `defaultOptions` which isn't public... So I'd just keep it in here. I do intend for it to go away... Added the TODO to remove.

Comments from Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member

Sure there is, Snowballing is enabled in a span (and its future children) by setting the snowball baggage item. In short, if you set that baggage item, requests that go out will return populated CollectedSpans (see

// If this is a snowball span, it gets special treatment: It skips the
)


Review status: 0 of 7 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.


Comments from Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

oh I see, ok. But anyway, by the time I figure out a trace is interesting, it's too late for future snowballing to be useful.


Review status: 0 of 7 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.


Comments from Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member

Ah, gotcha


Review status: 0 of 7 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.


Comments from Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member

Review status: 0 of 7 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.


server/admin.go, line 172 [r6] (raw file):

  ctx context.Context, args sql.SessionArgs,
) *sql.Session {
  ctx = tracing.WithTracer(ctx, tracing.TracerFromCtx(s.server.ctx.Ctx))

Nice, yeah this is in principle what we need to do - we need to "merge" information from an incoming context with information that is relevant to where the request got in now. We will eventually do the same with log tags and whatever else we store in context, but there is no infrastructure for that yet. Maybe add a TODO(radu): figure out a general way to do this


Comments from Reviewable

@tamird
Copy link
Copy Markdown
Contributor

tamird commented Aug 19, 2016

Reviewed 7 of 7 files at r4, 1 of 1 files at r5, 1 of 4 files at r6.
Review status: 4 of 7 files reviewed at latest revision, 14 unresolved discussions, all commit checks successful.


server/admin.go, line 172 [r6] (raw file):

Previously, RaduBerinde wrote…

Nice, yeah this is in principle what we need to do - we need to "merge" information from an incoming context with information that is relevant to where the request got in now. We will eventually do the same with log tags and whatever else we store in context, but there is no infrastructure for that yet. Maybe add a TODO(radu): figure out a general way to do this

What's the point of creating a new context here? Just that all the information from the server's context is removed?

sql/executor.go, line 287 [r4] (raw file):

}

// NewDummyExecutor creates an empty Executor that is used for certain tests.

while you're here, can you put this in a _test.go file?


sql/executor.go, line 572 [r4] (raw file):

      }

      // If we're no longer in a transaction, finish the trace.

does this want to be in a defer right after the txnState.ResetForSomethingSomething?


Comments from Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member

Review status: 4 of 7 files reviewed at latest revision, 14 unresolved discussions, all commit checks successful.


server/admin.go, line 172 [r6] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

What's the point of creating a new context here? Just that all the information from the server's context is removed?

We want to inherit the gRPC context, we are just adding a value to it

Comments from Reviewable

@tamird
Copy link
Copy Markdown
Contributor

tamird commented Aug 19, 2016

Review status: 4 of 7 files reviewed at latest revision, 14 unresolved discussions, all commit checks successful.


server/admin.go, line 172 [r6] (raw file):

Previously, RaduBerinde wrote…

We want to inherit the gRPC context, we are just adding a value to it

Oh, duh.

Comments from Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

Review status: 1 of 7 files reviewed at latest revision, 14 unresolved discussions, some commit checks pending.


server/admin.go, line 172 [r6] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Oh, duh.

added a TODO

sql/executor.go, line 287 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

while you're here, can you put this in a _test.go file?

unfortunately it uses the private `cfg` field... I don't know if I should make that public...

sql/executor.go, line 572 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

does this want to be in a defer right after the txnState.ResetForSomethingSomething?

It wants, but unfortunately it's not that easy. The problem is that SQL transactions can span ping-pongs back to the client, so different invocations of this method. So the state machine that we need to maintain needs to be more explicit.

Comments from Reviewable

@tamird
Copy link
Copy Markdown
Contributor

tamird commented Aug 22, 2016

Reviewed 1 of 6 files at r7, 1 of 1 files at r8, 3 of 4 files at r9.
Review status: 6 of 7 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


sql/executor.go, line 287 [r4] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

unfortunately it uses the private cfg field... I don't know if I should make that public...

I meant that you can move this function to `sql/helpers_test.go` so that it is only accessible from tests.

Comments from Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

Review status: 6 of 7 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


sql/executor.go, line 287 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I meant that you can move this function to sql/helpers_test.go so that it is only accessible from tests.

the test using it is in package `pgwire`, so I think this function can't be in a `_test` within `sql`.

Comments from Reviewable

@tamird
Copy link
Copy Markdown
Contributor

tamird commented Aug 22, 2016

Review status: 6 of 7 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


sql/executor.go, line 287 [r4] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

the test using it is in package pgwire, so I think this function can't be in a _test within sql.

Ah, that's true.

Comments from Reviewable

@RaduBerinde
Copy link
Copy Markdown
Member

:lgtm:


Review status: 6 of 7 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


util/log/structured.go, line 63 [r2] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

To be honest, I think the file and line are the most interesting part of these log messages :) There were definitely some messages that just said "error: xxx" and I had no idea where that's coming from. What's the downside of having them? I didn't put them in log.Trace() because maybe there's a cost to walking the stack, but here?

Here's how a trace looks without them (and it's gonna continue to look the same for most messages since they're log.Tracef()s):

I160818 22:48:29.702194 sql/session.go:291  SQL trace:
     0.046ms      0.000ms    executing 1/1: BEGIN TRANSACTION
     0.161ms      0.115ms    executing 1/1: DROP TABLE t.kv
     0.206ms      0.045ms    meta descriptor lookup
     0.355ms      0.149ms    looked up range descriptor
     0.591ms      0.236ms    looked up range descriptor
     0.613ms      0.022ms    sending RPC to [/Min, /Table/11)
     0.747ms      0.134ms    meta descriptor lookup
     0.751ms      0.004ms    looked up range descriptor from cache
     0.759ms      0.008ms    sending RPC to [/Min, /Table/11)
     0.863ms      0.104ms    meta descriptor lookup
     0.867ms      0.003ms    looked up range descriptor from cache
     0.875ms      0.008ms    sending RPC to [/Min, /Table/11)

And here's one with it:

     3.788ms      2.420ms    sql/event_log.go:109 Event: "drop_table", target: 51, info: {TableName:kv Statement:DROP TABLE t.kv User:root}
What's annoying is that the file/line may or may not show up depending how it makes it way into the event. E.g. if you use a `VEventf`, it depends on the verbosity whether the file/line shows up. But better to sometimes have more information than be consistent and never have it.

Comments from Reviewable

sql/executor.go Outdated
var err error
results, remainingStmts, err = runTxnAttempt(e, planMaker, origState, txnState, opt, stmtsToExec)

// Until #7881 fixed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO(andrei) here and elsewhere

@andreimatei andreimatei force-pushed the fix-autocommit branch 2 times, most recently from b185701 to abf2773 Compare September 9, 2016 19:27
@tbg
Copy link
Copy Markdown
Member

tbg commented Sep 9, 2016

Reviewed 1 of 7 files at r4, 6 of 6 files at r7, 7 of 7 files at r10, 1 of 1 files at r11, 5 of 5 files at r13, 4 of 4 files at r14.
Review status: all files reviewed at latest revision, 20 unresolved discussions, some commit checks pending.


internal/client/txn.go, line 449 [r12] (raw file):

// considered finalized and cannot be used to send any more commands.
func (txn *Txn) Rollback() error {
  log.VTracef(2, txn.Context, "rolling back Txn")

s/Txn/transaction/


internal/client/txn.go, line 560 [r12] (raw file):

      err = fn(txn, &opt)

      // Until 7881 is fixed.

TODO


internal/client/txn.go, line 569 [r12] (raw file):

          // fn succeeded, but didn't commit.
          err = txn.Commit()
          log.Tracef(txn.Context, "client.Txn did AutoCommit. err: %s\ntxn: %+v", err, txn.Proto)

TODO


sql/executor.go, line 1069 [r7] (raw file):

      case commit:
          // We're done with this txn.
          txnState.resetStateAndTxn(NoTxn)

This would always happen before finishSQLTxn()?


sql/executor.go, line 527 [r12] (raw file):

          results, remainingStmts, err = runTxnAttempt(e, planMaker, origState, txnState, opt, stmtsToExec)

          // Until #7881 fixed.

TODO


sql/executor.go, line 551 [r12] (raw file):

                  err, txnState.txn.Proto, execOpt.AutoRetry, execOpt.AutoCommit, stmts,
                  remainingStmts)
              // TODO(andrei): Until #7881 fixed.

Is this log message above part of 7881? Because it looks pretty messy (I guess that's ok for now though if it helps).


sql/session.go, line 162 [r1] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Done.

I don't understand this comment. As far as I can tell, `Ctx` is the base context to be used for anything that goes on in this transaction. Why doesn't it contain `sp` below (which then wouldn't have to be stored separately)?

sql/session.go, line 55 [r3] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Radu also said something about using more generic names, but I feel like nothing done here is generic, and it all needs to go away, in favor of a more general method of "forced sampling" that works with the global singleton tracer we should be using.

While I'm also not fond of this special-cased code, it's hidden behind an env var and can hopefully go away soon, and the explicit names make sure in the meantime that nobody confuses this for anything it isn't.

sql/session.go, line 67 [r7] (raw file):

context is the Session's base context, used by Ctx().


sql/session.go, line 193 [r7] (raw file):

  sp opentracing.Span
  // When "SQL tracing" is enabled, CollectedSpans accumulates spans as they're

Perhaps you can give a more direct reference to what SQL tracing is.


sql/session.go, line 234 [r7] (raw file):

      // Create a root span for this SQL txn.
      tracer := tracing.TracerFromCtx(ctx)
      ts.sp = tracer.StartSpan(fmt.Sprintf("sql txn for user: %q", s.User))

just "sql transaction" sounds more idiomatic. This is supposed to be a "component" of the system; filtering by user would be done through tags (but I don't think the API/frontend for that is quite there yet).


util/log/structured.go, line 63 [r2] (raw file):

Previously, RaduBerinde wrote…

What's annoying is that the file/line may or may not show up depending how it makes it way into the event. E.g. if you use a VEventf, it depends on the verbosity whether the file/line shows up. But better to sometimes have more information than be consistent and never have it.

There might be a small performance cost associated with it, but I doubt we're going to care about that order of magnitude for a while and saving a few greps a day should be worth it for the time being.

Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member

tbg commented Sep 9, 2016

This isn't clear from the PR description, but what this PR accomplishes is actually quite useful: It'll allow us to trace complete SQL transactions in lightstep (where now the parts of the transaction which go through TxnCoordSender aren't part of a common root span, and thus aren't related to each other). This should be useful for getting a better idea of where transactions are spending their time, and how much they retry, and where.

@andreimatei
Copy link
Copy Markdown
Contributor Author

edited the PR description


Review status: all files reviewed at latest revision, 20 unresolved discussions, all commit checks successful.


internal/client/txn.go, line 449 [r12] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

s/Txn/transaction/

Done.

internal/client/txn.go, line 560 [r12] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

TODO

Done.

internal/client/txn.go, line 569 [r12] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

TODO

I think this can stay

sql/executor.go, line 1069 [r7] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This would always happen before finishSQLTxn()?

well, this or a rollback. `finishSQLTxn()` is called when the something moves the TxnState to the NO_TXN state.

We have a problem with not-so-well-behaved clients that don't end their transactions properly. Similar to what I've found yesterday, about closing of pg connections that didn't properly roll back.
I've moved that fix to this PR, in a new commit. Now session.Finish properly cleans up in that case, including closing the span.


sql/executor.go, line 527 [r9] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

TODO(andrei) here and elsewhere

Done.

sql/executor.go, line 551 [r12] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Is this log message above part of 7881? Because it looks pretty messy (I guess that's ok for now though if it helps).

yup

sql/session.go, line 162 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I don't understand this comment. As far as I can tell, Ctx is the base context to be used for anything that goes on in this transaction. Why doesn't it contain sp below (which then wouldn't have to be stored separately)?

reworded comment. And removed `sp`.

sql/session.go, line 67 [r7] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

context is the Session's base context, used by Ctx().

meh, I like my form better

sql/session.go, line 193 [r7] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Perhaps you can give a more direct reference to what SQL tracing is.

replaced with the env var name

sql/session.go, line 234 [r7] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

just "sql transaction" sounds more idiomatic. This is supposed to be a "component" of the system; filtering by user would be done through tags (but I don't think the API/frontend for that is quite there yet).

Done.

Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member

tbg commented Sep 9, 2016

Reviewed 6 of 6 files at r15, 1 of 1 files at r16.
Review status: 2 of 8 files reviewed at latest revision, 20 unresolved discussions, all commit checks successful.


sql/session.go, line 67 [r7] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

meh, I like my form better

Ambiguous phrases like "You probably don't want to" aren't really fit for usage in code. At least say "Don't use it directly; see Ctx()`.

sql/session.go, line 130 [r16] (raw file):

  // If we're inside a txn, roll it back.
  if s.TxnState.State != NoTxn && s.TxnState.State != Aborted {
      s.TxnState.updateStateAndCleanupOnErr(

won't this put us in NoTxn state so that we don't call finishSQLTxn below even though we should?


sql/session.go, line 138 [r16] (raw file):

  // This will stop the heartbeating of the of the txn record.
  // TODO(andrei): This shouldn't have any effect, since, if there was a
  // transaction, we just explicitly rolled it back above, so the hearbeat loop

s/hearbeat/heartbeat/


Comments from Reviewable

@petermattis
Copy link
Copy Markdown
Collaborator

:lgtm:


Review status: 2 of 8 files reviewed at latest revision, 15 unresolved discussions, some commit checks failed.


sql/executor.go, line 520 [r3] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Done.

I don't see anything changed here.

util/tracing/tracer.go, line 94 [r3] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I had it in sql initially, but it uses the defaultOptions which isn't public... So I'd just keep it in here. I do intend for it to go away... Added the TODO to remove.

You could export `defaultOptions`. Guess I don't feel strongly either way.

Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member

tbg commented Sep 10, 2016

:lgtm: only nits.


Reviewed 2 of 2 files at r17, 4 of 4 files at r18.
Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.


internal/client/txn.go, line 569 [r18] (raw file):

          // fn succeeded, but didn't commit.
          err = txn.Commit()
          log.Tracef(txn.Context, "client.Txn did AutoCommit. err: %s\ntxn: %+v", err, txn.Proto)

use %v for the error as it can be nil.


sql/executor.go, line 551 [r12] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

yup

Then the TODO should be above. You can put both statements in a block. Also, the error should be logged with %s (and `aErr`, not `err`).

sql/session.go, line 56 [r18] (raw file):

// span baggage key used for marking a span
var keyFor7881Sample = "found#7881"

s/var/const/


Comments from Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

TFTR guys!


Review status: 2 of 8 files reviewed at latest revision, 12 unresolved discussions.


internal/client/txn.go, line 569 [r18] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

use %v for the error as it can be nil.

Done.

sql/executor.go, line 551 [r12] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Then the TODO should be above. You can put both statements in a block. Also, the error should be logged with %s (and aErr, not err).

Done.

sql/session.go, line 67 [r7] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Ambiguous phrases like "You probably don't want to" aren't really fit for usage in code. At least say "Don't use it directly; see Ctx()`.

well except sometimes you do want to use`s.context`, if you know what you're doing. The phrase means pay attention.

sql/session.go, line 130 [r16] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

won't this put us in NoTxn state so that we don't call finishSQLTxn below even though we should?

no, it's gonna put us in `Aborted` state.

sql/session.go, line 138 [r16] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

s/hearbeat/heartbeat/

Done.

sql/session.go, line 56 [r18] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

s/var/const/

Done.

Comments from Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

Review status: 2 of 8 files reviewed at latest revision, 12 unresolved discussions, some commit checks pending.


sql/executor.go, line 520 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I don't see anything changed here.

Done.

Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member

tbg commented Sep 10, 2016

Reviewed 6 of 6 files at r19, 2 of 2 files at r21, 4 of 4 files at r22.
Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.


internal/client/txn.go, line 569 [r18] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Done.

`%v`.

sql/session.go, line 67 [r7] (raw file):

Previously, andreimatei (Andrei Matei) wrote…

well except sometimes you do want to use s.context, if you know what you're doing. The phrase means pay attention.

I'm not even going to entertain that argument. May I remind you that you're defending a comment that reads "You probably don't want to use it directly". BTW Tamir thinks this whole comment is useless (including the reference to `Ctx()`) and that the relevant comment is already on `Ctx()`.

Comments from Reviewable

The TxnState matches the lifetime of a SQL txn, whereas the client.Txn
goes away when the SQL txn goes, for example, to the Aborted state.
Plus, the SQL executor has direct access to a txnState, so it's natural
for it to use the TxnState's context for log messages, and we want these
messages aassocited with the "transaction's trace". In fact, the
TxnState was already the the holder of a txn's root span, so this is in
effect unifying the owner of the span and of all the collected children
spans.

This commit also makes all SQL transactions be executed with a root
span.
At the very least, we have to close the transaction's root span.
We also do more cleanup and attempt to rollback the txn. This is nice
because it lets the rollback attempt be done in the context of the txn.
It supersedes the aborting that was being done by the TxnCoordSender
through the cancellation of the session's context with interrupted the
txn's heartbeat loop.
This only affects messages coming from log.Infof(). Messages going
directly to log.Trace(), or messages going through log.VTrace() but with
the verbosity turned off are not affected and will continue to not have
file/line.
I've decided to go medieval and invest in tracing all SQL execution in
the hope that cockroachdb#7881 will be triggered again. This builds on the prev
commit that added the ability to collect spans across the executor and a
client.Txn.
The tracing is gated by an env var. When turned on, every SQL txn
creates a new tracer that accumulates spans in the session's txnState
(this is similar to how "SQL tracing" currently works). When we detect
7881, we mark the root span of the current one of these traces for
"sampling", which means that later, when that span (and hence the trace)
is closed, we dump the trace with all the log messages in it (note that
currently this trace only has one span, since we're not very good at
starting child spans yet).
@andreimatei
Copy link
Copy Markdown
Contributor Author

Review status: 2 of 8 files reviewed at latest revision, 8 unresolved discussions.


internal/client/txn.go, line 569 [r18] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

%v.

Done.

sql/session.go, line 67 [r7] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I'm not even going to entertain that argument. May I remind you that you're defending a comment that reads "You probably don't want to use it directly".
BTW Tamir thinks this whole comment is useless (including the reference to Ctx()) and that the relevant comment is already on Ctx().

Done.

Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member

tbg commented Sep 10, 2016

Reviewed 6 of 6 files at r23, 2 of 2 files at r25, 4 of 4 files at r26.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


Comments from Reviewable

@andreimatei andreimatei merged commit 860d504 into cockroachdb:master Sep 10, 2016
@andreimatei andreimatei deleted the fix-autocommit branch September 10, 2016 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants