sql: auto-upgrade causing segfault in ShowClusterSettings #25600

@a-robinson

Over the last couple of days there have been a number of test flakes caused by a SIGSEGV in some DString code within the ShowClusterSettings SQL logic. In all cases, it appears to be tightly correlated with the new auto-upgrade task running during server shutdown.

#25603
#25570
#25567
#25566
#25511
#25485

In addition to all the issues linked above, I also saw this locally while running make testshort PKG=./pkg/server:

I180516 22:21:18.635591 23796 server/server.go:782  [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I180516 22:21:18.637910 23796 server/config.go:539  [n?] 1 storage engine initialized
I180516 22:21:18.637921 23796 server/config.go:542  [n?] RocksDB cache size: 128 MiB
I180516 22:21:18.637928 23796 server/config.go:542  [n?] store 0: in-memory, size 0 B
I180516 22:21:18.639794 23796 server/node.go:376  [n?] **** cluster 5294ef96-9010-4d9b-8a5c-5d8df1092d43 has been created
I180516 22:21:18.639828 23796 server/server.go:1356  [n?] **** add additional nodes by specifying --join=127.0.0.1:63817
I180516 22:21:18.640272 23796 storage/store.go:1440  [n1,s1] [n1,s1]: failed initial metrics computation: [n1,s1]: system config not yet available
I180516 22:21:18.640317 23796 server/node.go:506  [n1] initialized store [n1,s1]: disk (capacity=512 MiB, available=512 MiB, used=0 B, logicalBytes=6.9 KiB), ranges=1, leases=0, writes=0.00, bytesPerReplica={p10=7043.00 p25=7043.00 p50=7043.00 p75=7043.00 p90=7043.00 pMax=7043.00}, writesPerReplica={p10=0.00 p25=0.00 p50=0.00 p75=0.00 p90=0.00 pMax=0.00}
I180516 22:21:18.640342 23796 server/node.go:354  [n1] node ID 1 initialized
I180516 22:21:18.640383 23796 gossip/gossip.go:333  [n1] NodeDescriptor set to node_id:1 address:<network_field:"tcp" address_field:"127.0.0.1:63817" > attrs:<> locality:<> ServerVersion:<major_val:2 minor_val:0 patch:0 unstable:3 >
I180516 22:21:18.640431 23796 storage/stores.go:222  [n1] read 0 node addresses from persistent storage
I180516 22:21:18.640494 23796 server/node.go:647  [n1] connecting to gossip network to verify cluster ID...
I180516 22:21:18.640508 23796 server/node.go:672  [n1] node connected via gossip and verified as part of cluster "5294ef96-9010-4d9b-8a5c-5d8df1092d43"
I180516 22:21:18.640521 23796 server/node.go:440  [n1] node=1: started with [<no-attributes>=<in-mem>] engine(s) and attributes []
I180516 22:21:18.640584 23796 server/server.go:1481  [n1] starting https server at 127.0.0.1:63818
I180516 22:21:18.640601 23796 server/server.go:1482  [n1] starting grpc/postgres server at 127.0.0.1:63817
I180516 22:21:18.640616 23796 server/server.go:1483  [n1] advertising CockroachDB node at 127.0.0.1:63817
I180516 22:21:18.642278 24011 storage/replica_command.go:863  [split,n1,s1,r1/1:/M{in-ax}] initiating a split of this range at key /System/"" [r2]
I180516 22:21:18.652670 23771 storage/replica_command.go:863  [split,n1,s1,r2/1:/{System/-Max}] initiating a split of this range at key /System/NodeLiveness [r3]
I180516 22:21:18.656196 23720 storage/replica_command.go:863  [split,n1,s1,r3/1:/{System/NodeL…-Max}] initiating a split of this range at key /System/NodeLivenessMax [r4]
I180516 22:21:18.660069 23775 storage/replica_command.go:863  [split,n1,s1,r4/1:/{System/NodeL…-Max}] initiating a split of this range at key /System/tsd [r5]
I180516 22:21:18.664190 23789 storage/replica_command.go:863  [split,n1,s1,r5/1:/{System/tsd-Max}] initiating a split of this range at key /System/"tse" [r6]
I180516 22:21:18.673742 24055 storage/replica_command.go:863  [split,n1,s1,r6/1:/{System/tse-Max}] initiating a split of this range at key /Table/SystemConfigSpan/Start [r7]
I180516 22:21:18.677631 23791 storage/replica_command.go:863  [split,n1,s1,r7/1:/{Table/System…-Max}] initiating a split of this range at key /Table/11 [r8]
I180516 22:21:18.680980 24098 storage/replica_command.go:863  [split,n1,s1,r8/1:/{Table/11-Max}] initiating a split of this range at key /Table/12 [r9]
I180516 22:21:18.684146 24099 storage/replica_command.go:863  [split,n1,s1,r9/1:/{Table/12-Max}] initiating a split of this range at key /Table/13 [r10]
I180516 22:21:18.687730 24078 storage/replica_command.go:863  [split,n1,s1,r10/1:/{Table/13-Max}] initiating a split of this range at key /Table/14 [r11]
I180516 22:21:18.691022 24039 storage/replica_command.go:863  [split,n1,s1,r11/1:/{Table/14-Max}] initiating a split of this range at key /Table/15 [r12]
I180516 22:21:18.694685 24101 storage/replica_command.go:863  [split,n1,s1,r12/1:/{Table/15-Max}] initiating a split of this range at key /Table/16 [r13]
I180516 22:21:18.698035 24150 storage/replica_command.go:863  [split,n1,s1,r13/1:/{Table/16-Max}] initiating a split of this range at key /Table/17 [r14]
I180516 22:21:18.699968 24049 sql/event_log.go:124  [n1,intExec=optInToDiagnosticsStatReporting] Event: "set_cluster_setting", target: 0, info: {SettingName:diagnostics.reporting.enabled Value:true User:root}
I180516 22:21:18.701715 23805 storage/replica_command.go:863  [split,n1,s1,r14/1:/{Table/17-Max}] initiating a split of this range at key /Table/18 [r15]
I180516 22:21:18.705036 24166 storage/replica_command.go:863  [split,n1,s1,r15/1:/{Table/18-Max}] initiating a split of this range at key /Table/19 [r16]
I180516 22:21:18.709970 24195 storage/replica_command.go:863  [split,n1,s1,r16/1:/{Table/19-Max}] initiating a split of this range at key /Table/20 [r17]
I180516 22:21:18.711537 24120 sql/event_log.go:124  [n1,intExec=set-setting] Event: "set_cluster_setting", target: 0, info: {SettingName:version Value:$1 User:root}
I180516 22:21:18.713148 24212 storage/replica_command.go:863  [split,n1,s1,r17/1:/{Table/20-Max}] initiating a split of this range at key /Table/21 [r18]
I180516 22:21:18.715581 24171 sql/event_log.go:124  [n1,intExec=disableNetTrace] Event: "set_cluster_setting", target: 0, info: {SettingName:trace.debug.enable Value:false User:root}
I180516 22:21:18.716319 24227 storage/replica_command.go:863  [split,n1,s1,r18/1:/{Table/21-Max}] initiating a split of this range at key /Table/22 [r19]
I180516 22:21:18.718802 24103 storage/replica_command.go:863  [split,n1,s1,r19/1:/{Table/22-Max}] initiating a split of this range at key /Table/23 [r20]
I180516 22:21:18.722196 24231 sql/event_log.go:124  [n1,intExec=initializeClusterSecret] Event: "set_cluster_setting", target: 0, info: {SettingName:cluster.secret Value:gen_random_uuid()::STRING User:root}
I180516 22:21:18.723901 23796 server/server.go:1560  [n1] done ensuring all necessary migrations have run
I180516 22:21:18.723917 23796 server/server.go:1563  [n1] serving sql connections
I180516 22:21:18.724191 23796 util/stop/stopper.go:471  quiescing; tasks left:
1      [async] auto-upgrade
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x4832a05]

goroutine 24105 [running]:
github.com/cockroachdb/cockroach/pkg/sql/sem/tree.(*DString).Size(0x0, 0xc4211cc3d8)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/sem/tree/datum.go:933 +0x5
github.com/cockroachdb/cockroach/pkg/sql/sqlbase.(*RowContainer).rowSize(0xc420f43c20, 0xc4211cc4b8, 0x1, 0x1, 0x0)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/row_container.go:247 +0x66
github.com/cockroachdb/cockroach/pkg/sql/sqlbase.(*RowContainer).AddRow(0xc420f43c20, 0x5f9a1e0, 0xc4210296b0, 0xc4211cc4b8, 0x1, 0x1, 0x7, 0x0, 0x0, 0x0, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/row_container.go:272 +0x7c
github.com/cockroachdb/cockroach/pkg/sql.(*planner).ShowClusterSetting.func1(0x5f9a1e0, 0xc4210296b0, 0xc420b64420, 0x0, 0x0, 0x479b800, 0xc420ca40e8)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/show_cluster_setting.go:161 +0x186
github.com/cockroachdb/cockroach/pkg/sql.doExpandPlan(0x5f9a1e0, 0xc4210296b0, 0xc420b64420, 0x7fffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x5f90560, 0xc421cdac80, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/expand_plan.go:326 +0x2df2
github.com/cockroachdb/cockroach/pkg/sql.(*planner).expandPlan(0xc420b64420, 0x5f9a1e0, 0xc4210296b0, 0x5f90560, 0xc421cdac80, 0x5f90560, 0xc421cdac80, 0x0, 0x0)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/expand_plan.go:35 +0xce
github.com/cockroachdb/cockroach/pkg/sql.(*planner).optimizePlan(0xc420b64420, 0x5f9a1e0, 0xc4210296b0, 0x5f90560, 0xc421cdac80, 0xc420960fca, 0x1, 0x1, 0x5f90560, 0xc421cdac80, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/optimize.go:47 +0xeb
github.com/cockroachdb/cockroach/pkg/sql.(*planner).makePlan(0xc420b64420, 0x5f9a1e0, 0xc4210296b0, 0x5f9c820, 0xc42144ec80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/plan.go:325 +0x2ea
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine(0xc420b64000, 0x5f9a1e0, 0xc4210296b0, 0x5f9c820, 0xc42144ec80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:579 +0xbe9
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState(0xc420b64000, 0x5f9a1e0, 0xc4210296b0, 0x5f9c820, 0xc42144ec80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:398 +0xb33
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt(0xc420b64000, 0x5f9a1e0, 0xc4210296b0, 0x5f9c820, 0xc42144ec80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:92 +0x358
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc420b64000, 0x5f9a1e0, 0xc421029170, 0x0, 0x0, 0x0)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1051 +0x2039
github.com/cockroachdb/cockroach/pkg/sql.(*internalExecutorImpl).initConnEx.func1(0xc420b64000, 0x5f9a1e0, 0xc421029170, 0xc4203cd900, 0xc420960fd0)
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:211 +0x86
created by github.com/cockroachdb/cockroach/pkg/sql.(*internalExecutorImpl).initConnEx
	/Users/alex/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:200 +0x268
FAIL	github.com/cockroachdb/cockroach/pkg/server	6.750s

Labels

C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-2-temp-unavailability: Temp crashes or other availability problems. Can be worked around or resolved by restarting.
