SQLState(08006) error causing HTTP 500 error upstream #31645
Description
Describe the problem
While conducting a routine migration, a customer encountered the following:
- Context canceled / aborted transactions in the logs, for example:
  W181019 15:24:49.629554 61831020 internal/client/txn.go:556 [n2,client=130.211.2.195:59869,user=foo] failure aborting transaction: HandledRetryableTxnError: TransactionAbortedError: txn aborted "sql txn" id=581b6de4 key=/Table/82/1/"\xe4\x15\xd4\xe7\xab\xc4G\x1d\x9c5\xd99\x98\xab\xe2\xf5"/0 rw=true pri=0.03491282 iso=SERIALIZABLE stat=PENDING epo=0 ts=1539962689.619304580,0 orig=1539962689.619304580,0 max=1539962690.119304580,0 wto=false rop=false seq=2; abort caused by: failed to send RPC: sending to all 3 replicas failed; last error: {<nil> context canceled}
  I181019 15:24:56.600105 181 gossip/gossip.go:488 [n2] gossip status (ok, 3 nodes)
- Upstream SQLSTATE(08006) errors in HikariCP:
  WARN c.z.hikari.pool.ProxyConnection - roach - Connection org.postgresql.jdbc.PgConnection@594e7e23 marked as broken because of SQLSTATE(08006), ErrorCode(0)
- HTTP 500 errors in their application due to the broken connection & 08006 error.
Looking in our code, I see that code 08006 corresponds to CodeConnectionFailureError. It is used in only two places: schema_changer.go and inbound.go:99.
The comment above that line does seem to describe the condition reported in the log warning. However, it's not clear why this race results in an 08006 error: that code would usually indicate a problem with the connection, while (at least as far as I can see) the underlying issue is just context cancellation.
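To illustrate why the choice of code matters to clients, here is a minimal sketch of how an application might tell the two conditions apart by SQLSTATE. The class `SqlStateClassifier` and its `Kind` enum are hypothetical names made up for this example; they are not part of pgjdbc, HikariCP, or CockroachDB. The sketch assumes only the standard meanings of the codes: class `08` is "connection exception" (which is how the pool treated 08006 in the warning above), and `40001` is "serialization failure", the code a client would normally expect for a retryable transaction error.

```java
import java.sql.SQLException;

// Hypothetical helper for illustration only; not an API of any library named in this issue.
final class SqlStateClassifier {

    enum Kind { CONNECTION_FAILURE, RETRYABLE_TXN, OTHER }

    // Classifies an exception by its SQLSTATE string.
    static Kind classify(SQLException e) {
        String state = e.getSQLState();
        if (state == null) {
            return Kind.OTHER;
        }
        // SQLSTATE class 08 is "connection exception"; 08006 is connection_failure.
        if (state.startsWith("08")) {
            return Kind.CONNECTION_FAILURE;
        }
        // 40001 is serialization_failure, which a client can handle by retrying the transaction.
        if (state.equals("40001")) {
            return Kind.RETRYABLE_TXN;
        }
        return Kind.OTHER;
    }
}
```

With a classification like this, an aborted transaction reported as 08006 lands in the connection-failure branch, so the client has no way to recognize it as something it could simply retry.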
To Reproduce
This doesn't reproduce easily, but did result in 4 errors today for the customer during routine migrations.
Environment:
Postgres JDBC 42.1.3
Hibernate 5.1.8.Final
HikariCP 3.2.0
CRDB version to be provided.
Additional context
What was the impact?
HTTP 500 errors potentially sent to end users.
@vivekmenezes, assigning to you first since @andreimatei indicated you wrote this portion of the code. If there's a better home let me know.