spanner: nil pointer dereference in ReadWriteTransaction.rollback when session handle is nil #14025
Closed
Labels
api: spanner (Issues related to the Spanner API.)
priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
Description
Client
Spanner
Environment
Linux (Debian-based, amd64) on GCE
go version go1.25.0
cloud.google.com/go/spanner v1.88.0
github.com/googleapis/go-sql-spanner v1.24.0
Code and Dependencies
This panic occurs when a transaction commit is aborted by Spanner (due to a conflict, for example) and the retry fails because the context passed into the transaction has been cancelled.
Expected behavior
When a transaction commit is aborted and the retry fails, an error should be returned.
Actual behavior
cloud.google.com/go/spanner panics with the following stack trace:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x103318a]
goroutine 3024 [running]:
sync/atomic.(*Int32).Add(...)
/usr/local/go/src/sync/atomic/type.go:94
sync.(*RWMutex).RLock(...)
/usr/local/go/src/sync/rwmutex.go:72
cloud.google.com/go/spanner.(*sessionHandle).getID(0x10468d5?)
.../cloud.google.com/go/spanner@v1.88.0/session.go:108 +0x2a
cloud.google.com/go/spanner.(*ReadWriteTransaction).rollback(0x1c04b179f340, {0x1f61a58, 0x3351180})
.../cloud.google.com/go/spanner@v1.88.0/transaction.go:1940 +0xbf
cloud.google.com/go/spanner.(*ReadWriteStmtBasedTransaction).Rollback(0x1c04b179f340, {0x1f61a58?, 0x3351180?})
.../cloud.google.com/go/spanner@v1.88.0/transaction.go:2156 +0x25
github.com/googleapis/go-sql-spanner.(*readWriteTransaction).rollback(0x1c04cd640380)
.../github.com/googleapis/go-sql-spanner@v1.24.0/transaction.go:609 +0x32
github.com/googleapis/go-sql-spanner.(*readWriteTransaction).Rollback(0x1c04cd640380)
.../github.com/googleapis/go-sql-spanner@v1.24.0/transaction.go:602 +0xc5
github.com/googleapis/go-sql-spanner.(*delegatingTransaction).Rollback(0x0?)
.../github.com/googleapis/go-sql-spanner@v1.24.0/transaction.go:159 +0x22
github.com/googleapis/go-sql-spanner.(*conn).Rollback(0x0?, {0x0?, 0x0?})
.../github.com/googleapis/go-sql-spanner@v1.24.0/conn.go:1757 +0x1c
github.com/googleapis/go-sql-spanner.(*conn).Close(0x1c04cd604680)
.../github.com/googleapis/go-sql-spanner@v1.24.0/conn.go:1368 +0x2d
database/sql.(*driverConn).finalClose.func2()
/usr/local/go/src/database/sql/sql.go:695 +0x32
database/sql.withLock({0x1f584b0, 0x1c04cd640280}, 0x1c04cc3cb950)
/usr/local/go/src/database/sql/sql.go:3572 +0x71
database/sql.(*driverConn).finalClose(0x1c04cd640280)
/usr/local/go/src/database/sql/sql.go:693 +0xe5
database/sql.(*driverConn).Close(0x1c04cd640280)
/usr/local/go/src/database/sql/sql.go:674 +0x138
database/sql.(*DB).putConn(0x1c04a1fa21a0, 0x1c04cd640280, {0x0, 0x0}, 0x1)
/usr/local/go/src/database/sql/sql.go:1518 +0x333
database/sql.(*driverConn).releaseConn(...)
/usr/local/go/src/database/sql/sql.go:578
database/sql.(*Conn).close(0x1c04cc3cbaa8?, {0x0, 0x0})
/usr/local/go/src/database/sql/sql.go:2141 +0xc5
database/sql.(*Conn).Close(...)
/usr/local/go/src/database/sql/sql.go:2153
github.com/googleapis/go-sql-spanner.runTransactionWithOptions.func1()
.../github.com/googleapis/go-sql-spanner@v1.24.0/driver.go:1122 +0x1c
github.com/googleapis/go-sql-spanner.runTransactionWithOptions(...)
.../github.com/googleapis/go-sql-spanner@v1.24.0/driver.go:1194 +0x42c
Additional context
It seems like the issue is that t.sh can be nil in the following check:
sid, client := t.sh.getID(), t.sh.getClient()
The following is an AI-generated description of the issue and a proposed fix, if it's helpful:
Root cause
ReadWriteTransaction.rollback at https://github.com/googleapis/google-cloud-go/blob/main/spanner/transaction.go#L1940 dereferences t.sh without a nil check:
func (t *ReadWriteTransaction) rollback(ctx context.Context) {
t.mu.Lock()
t.state = txClosed
if t.tx == nil {
t.mu.Unlock()
return
}
t.mu.Unlock()
// In case that sessionHandle was destroyed but transaction body fails to
// report it.
sid, client := t.sh.getID(), t.sh.getClient() // panics when t.sh is nil
// ...
}
The comment acknowledges that the session handle can be destroyed, but the code only checks the values returned by t.sh.getID() and t.sh.getClient(); it never checks whether t.sh itself is nil before calling those methods.
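The distinction matters because the panic happens inside the method call, before any return value exists to check. A minimal self-contained sketch (handle and getID here are toy stand-ins, not the library's types) shows that a method which locks a mutex stored in the struct panics as soon as it is invoked on a nil receiver, just like sessionHandle.getID in the stack trace above:

```go
package main

import (
	"fmt"
	"sync"
)

// handle mimics spanner's sessionHandle: getID locks a mutex stored in
// the struct, so it must dereference the receiver.
type handle struct {
	mu sync.RWMutex
	id string
}

func (h *handle) getID() string {
	h.mu.RLock() // panics here when h is nil, inside sync.RWMutex.RLock
	defer h.mu.RUnlock()
	return h.id
}

// callGetID reports whether getID panicked, mirroring the crash in rollback.
func callGetID(h *handle) (id string, panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	return h.getID(), false
}

func main() {
	var h *handle // nil, like t.sh after the session handle is recycled
	if _, panicked := callGetID(h); panicked {
		fmt.Println("nil receiver: getID panicked")
	}
	if id, panicked := callGetID(&handle{id: "s1"}); !panicked {
		fmt.Println("valid handle: id =", id)
	}
}
```

Checking sid == "" after the call can never help: control never reaches that check on a nil handle.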
How t.sh becomes nil
ReadWriteStmtBasedTransaction.CommitWithReturnResp unconditionally recycles and nils the session handle, including for Aborted errors:
// transaction.go:2148-2153
if t.sh != nil {
t.sh.recycle()
t.sh = nil
}
return resp, err
When the go-sql-spanner driver retries an aborted transaction via resetTransactionForRetry, it calls ReadWriteStmtBasedTransaction.ResetForRetry, which passes
the now-nil t.sh to newReadWriteStmtBasedTransactionWithSessionHandle. That function tries to acquire a new session, but if the context has been cancelled (e.g.
by errgroup), session acquisition fails and ResetForRetry returns an error. The old ReadWriteStmtBasedTransaction (with sh == nil) remains referenced.
During cleanup, the deferred conn.Close() in runTransactionWithOptions sees the connection is still in a transaction (because resetTransactionForRetry restored
c.tx before the retry failed) and calls Rollback, which reaches ReadWriteTransaction.rollback with a nil t.sh.
Suggested fix
Add a nil check for t.sh in ReadWriteTransaction.rollback:
func (t *ReadWriteTransaction) rollback(ctx context.Context) {
t.mu.Lock()
t.state = txClosed
if t.tx == nil {
t.mu.Unlock()
return
}
t.mu.Unlock()
+ if t.sh == nil {
+ return
+ }
sid, client := t.sh.getID(), t.sh.getClient()
if sid == "" || client == nil {
return
}
// ...
}
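The effect of the guard can be demonstrated with the same toy shapes as above (handle and rollback are illustrative, not the library's code): checking the pointer itself before calling methods on it turns the crash into the same quiet no-op that the existing sid == "" check already provides for a destroyed-but-non-nil handle:

```go
package main

import "fmt"

// handle is a toy stand-in for the session handle.
type handle struct{ id string }

func (h *handle) getID() string { return h.id }

// rollback sketches the guarded version: the nil check on the pointer
// runs before any method call, so a recycled handle is a safe no-op.
func rollback(sh *handle) string {
	if sh == nil {
		return "no-op: nil session handle"
	}
	if sid := sh.getID(); sid == "" {
		return "no-op: empty session id"
	}
	return "rolled back session " + sh.getID()
}

func main() {
	fmt.Println(rollback(nil))              // previously: panic
	fmt.Println(rollback(&handle{}))        // already handled today
	fmt.Println(rollback(&handle{id: "s1"})) // normal path
}
```

An alternative fix would be to stop nilling t.sh on Aborted errors in CommitWithReturnResp, but the rollback-side guard is the smaller change and also protects any other path that reaches rollback with a recycled handle.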