Skip to content

sql: do not print stack trace when logging if txn is not open#91160

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:fix-exec-log-aborted-txn
Nov 14, 2022
Merged

sql: do not print stack trace when logging if txn is not open#91160
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:fix-exec-log-aborted-txn

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Nov 2, 2022

After executing each statement, that statement might be logged. If there
were any audit events, then we attempt to resolve the table names for
which the audit events have occurred. To do the resolution we're using
the current txn. Previously, if that txn has been aborted or committed,
it would result in a scary-looking stack trace added to the log, and
this commit fixes it.

Epic: None

Release note: None

@yuzefovich yuzefovich requested review from a team and knz November 2, 2022 20:57
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@yuzefovich
Copy link
Copy Markdown
Member Author

Here is an example I saw in one of the tickets (somewhat redacted):

E221020 18:48:42.960063 1595750707 kv/kvclient/kvcoord/txn_coord_sender.go:708 ⋮ [...] 2914633 ‹client already committed or rolled back the transaction. Trying to execute: 1 Get›. stack:
‹goroutine 1595750707 [running]:›
‹runtime/debug.Stack()›
‹ GOROOT/src/runtime/debug/stack.go:24 +0x65›
‹github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*TxnCoordSender).maybeRejectClientLocked(0xc040e2a580, {0x63b1f10, 0xc0464fb7c0}, 0xc076686b08)›
‹ github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:707 +0x2ae›
‹github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*TxnCoordSender).Send(_, {_, _}, {{{0x0, 0x0, 0x0}, 0x0, {0x0, 0x0, 0x0, ...}, ...}, ...})›
‹ github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:492 +0xf1›
‹github.com/cockroachdb/cockroach/pkg/kv.(*DB).sendUsingSender(_, {_, _}, {{{0x0, 0x0, 0x0}, 0x0, {0x0, 0x0, 0x0, ...}, ...}, ...}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/kv/db.go:985 +0x14d›
‹github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Send(_, {_, _}, {{{0x0, 0x0, 0x0}, 0x0, {0x0, 0x0, 0x0, ...}, ...}, ...})›
‹ github.com/cockroachdb/cockroach/pkg/kv/txn.go:1124 +0x21d›
‹github.com/cockroachdb/cockroach/pkg/kv.sendAndFill({0x63b1f10, 0xc0464fb7c0}, 0xc0464fb7c0, 0xc0917c0b00)›
‹ github.com/cockroachdb/cockroach/pkg/kv/db.go:830 +0xfc›
‹github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Run(0xc03050d6b0, {0x63b1f10, 0xc0464fb7c0}, 0x0)›
‹ github.com/cockroachdb/cockroach/pkg/kv/txn.go:696 +0x74›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuerier.query({0x1, {0x4ebf805, 0x3}, {{0xc00002e660}, {0xc00002e660}, 0x0}}, {0x63b1f10, 0xc0464fb7c0}, 0x459769, 0xc0766874b8)›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:112 +0xdd›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.lookupDescriptorsUnvalidated({0x63b1f10, 0xc0464fb7c0}, 0x459640, {0x1, {0x4ebf805, 0x3}, {{0xc00002e660}, {0xc00002e660}, 0x0}}, {0xc025f3da60, ...})›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:41 +0x145›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.lookupDescriptorsAndValidate({0x63b1f10, 0xc0464fb7c0}, {{0x8a28a6b8, 0x7f9f, 0x10, 0x0}}, 0xc03050d6b0, {0x1, {0x4ebf805, 0x3}, ...}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/descriptors.go:58 +0xef›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.MustGetDescriptorsByID({0x63b1f10, 0xc0464fb7c0}, {{0x32ebd698, 0xc0, 0x76687728, 0xc0}}, {{0xc00002e660}, {0xc00002e660}, 0x0}, 0xc03050d6b0, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/descriptors.go:164 +0x11c›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*kvDescriptors).getByIDs(0xc01de17610, {0x63b1f10, 0xc0464fb7c0}, {{0x32ebd698, 0xc0, 0x766877d0, 0xc0}}, 0xc076687830, {0x632db90, 0xc0414af1b0}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/kv_descriptors.go:253 +0x3c7›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorsByID.func2()›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:161 +0x405›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).withReadFromStore(0xc01de175a8, 0x0, 0x1)›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:374 +0x29›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorsByID(0xc01de175a8, {0x63b1f10, 0xc0464fb7c0}, 0xc03050d6b0, {0x1, 0x0, 0x1, 0x1, 0x1, 0x0}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:141 +0x5b0›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDatabaseByID(0xc037e718c0, {0x63b1f10, 0xc0464fb7c0}, 0x1843029, 0x3bb, {0x1, 0x0, 0x1, 0x1, 0x1, ...})›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/database.go:101 +0x90›
‹github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).GetImmutableDatabaseByID(0x39, {0x63b1f10, 0xc0464fb7c0}, 0x0, 0x92492492, {0x1, 0x0, 0x1, 0x1, 0x1, ...})›
‹ github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/database.go:95 +0x3b›
‹github.com/cockroachdb/cockroach/pkg/sql.(*planner).getQualifiedTableName(0xc01de179f0, {0x63b1f10, 0xc0464fb7c0}, {0x64dea58, 0xc018571900})›
‹ github.com/cockroachdb/cockroach/pkg/sql/resolver.go:727 +0x9e›
‹github.com/cockroachdb/cockroach/pkg/sql.(*planner).maybeLogStatementInternal(0xc01de179f0, {0x63b1f10, 0xc0464fb7c0}, 0x0, 0x42, 0x0, 0x961, 0x1, {0x0, 0x0}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/exec_log.go:331 +0x2176›
‹github.com/cockroachdb/cockroach/pkg/sql.(*planner).maybeLogStatement(0x0, {0x63b1f10, 0xc0464fb7c0}, 0x320ad96, 0xf0, 0x316d094, 0xc001bb20c0, 0x0, {0x0, 0x0}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/exec_log.go:165 +0xa5›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine.func1()›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1072 +0x19a›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine(0xc01de17300, {0x63b1f10, 0xc0464fb7c0}, 0xc01de179f0, {0x7f9f55676f30, 0xc06087a8d0})›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1205 +0xf0f›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState(0xc01de17300, {0x63b1fb8, 0xc02b563860}, {{0x63e3738, 0xc0720c4700}, {0xc01e888b1f, 0x58}, 0x3, 0x1}, 0xc068efe900, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:699 +0x2251›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt(0xc01de17300, {0x63b1fb8, 0xc02b563860}, {{0x63e3738, 0xc0720c4700}, {0xc01e888b1f, 0x58}, 0x3, 0x1}, 0xc068efe900, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:144 +0x59e›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execPortal(0xc01de17300, {0x63b1fb8, 0xc02b563860}, {0xc068efe900, {0xc06087a840, 0x3, 0x3}, {0x9bd7d5c, 0x1, 0x1}, ...}, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:232 +0x16e›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd.func2({{0xc017fc60a2, 0x0}, 0x1, {0x3987b210, 0xedae38ef7, 0x0}, 0x1}, 0xc01de17300, 0xc076689980, 0xc076689970, ...)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1969 +0x552›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc01de17300)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1971 +0x41b›
‹github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc01de17300, {0x63b1f10, 0xc01cab5000}, 0xc040ef9dd0, {0x5400, 0x15000, 0xc001937140, 0x0, 0x0}, 0xc0049734b0)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1817 +0x278›
‹github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0x63b1f10, {0x63b1f10, 0xc01cab5000}, {0x3}, {0x5400, 0x15000, 0xc001937140, 0x0, 0x0}, 0xc0049734b0)›
‹ github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:763 +0xf7›
‹github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1()›
‹ github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:728 +0x415›
‹created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync›
‹ github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:639 +0x273›
W221020 18:48:42.960568 1595750707 sql/exec_log.go:333 ⋮ [...] 2914634 name for audited table ID ... not found: ‹TransactionStatusError: client already committed or rolled back the transaction. Trying to execute: 1 Get (REASON_TXN_COMMITTED)›

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find 💯

this will benefit from a backport

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance I can snipe you into figuring out a better story for the relationship between a txn and resolving names? I no longer know what I'm talking about, but it has always bothered me that resolving names is tied to the kv txn needing the resolving. Resolving names should be independent of any kv txn; I think it should only need a timestamp and a descriptor collection (or whatever has to do with the state of dirty descriptors modified in the current SQL txn). I feel like this has come up many times and we keep patching over it, and if I understand correctly, this here is one more bandaid.
cc @ajwerner

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/exec_log.go line 337 at r1 (raw file):

			// extract a name to include in the logging events.
			if p.txn != nil && p.txn.IsAborted() {
				// Do not attempt to resolve the table name if the txn is

nit: cut the embellishment from this comment; say that you can't use an aborted txn

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in this case, the collection doesn't have a modified copy of this table and doesn't have a lease/couldn't get one. The descs.Collection is tied implicitly to a transaction. I think if you want to resolve a name outside the context of a transaction, you should have to create a new transaction and a new collection. There are tools to do that.

An alternative is to fail if we fail to find the table in the lease manager if the transaction we have a handle to is not open. Maybe that'd be better. My guess is that the table in question was dropped.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @yuzefovich)

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revise the above assessment to say that the database was dropped.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @yuzefovich)

@yuzefovich yuzefovich force-pushed the fix-exec-log-aborted-txn branch from e793fa6 to ba16ab8 Compare November 3, 2022 15:14
@yuzefovich yuzefovich changed the title sql: do not print stack trace when logging if txn is already aborted sql: do not print stack trace when logging if txn is not open Nov 3, 2022
Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially misread the stack trace - note REASON_TXN_COMMITTED, so we can only use the txn for the resolution if it is still in an open state. PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

@yuzefovich yuzefovich force-pushed the fix-exec-log-aborted-txn branch from ba16ab8 to 2242e68 Compare November 3, 2022 15:16
After executing each statement, that statement might be logged. If there
were any audit events, then we attempt to resolve the table names for
which the audit events have occurred. To do the resolution we're using
the current txn. Previously, if that txn has been aborted or committed,
it would result in a scary-looking stack trace added to the log, and
this commit fixes it.

Release note: None
@yuzefovich yuzefovich force-pushed the fix-exec-log-aborted-txn branch from 2242e68 to d9726d2 Compare November 8, 2022 18:13
@yuzefovich
Copy link
Copy Markdown
Member Author

@knz PTAL

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@yuzefovich
Copy link
Copy Markdown
Member Author

TFTR!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 14, 2022

Build succeeded:

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Nov 14, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from d9726d2 to blathers/backport-release-22.1-91160: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants