-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: TestTruncateWhileColumnBackfill failed under stress #27687
Copy link
Copy link
Closed
Labels
A-schema-changesC-test-failureBroken test (automatically or manually discovered).Broken test (automatically or manually discovered).O-robotOriginated from a bot.Originated from a bot.
Milestone
Description
SHA: https://github.com/cockroachdb/cockroach/commits/eed443af32474fd76d3b860b44ebfa339b249703
Parameters:
TAGS=
GOFLAGS=
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=784093&tab=buildLog
=== RUN TestTruncateWhileColumnBackfill
W180718 06:00:35.798307 140236 server/status/runtime.go:143 Could not parse build timestamp: parsing time "" as "2006/01/02 15:04:05": cannot parse "" as "2006"
I180718 06:00:35.810167 140236 server/server.go:795 [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I180718 06:00:35.812809 140236 server/config.go:552 [n?] 1 storage engine initialized
I180718 06:00:35.812834 140236 server/config.go:555 [n?] RocksDB cache size: 128 MiB
I180718 06:00:35.812842 140236 server/config.go:555 [n?] store 0: in-memory, size 0 B
I180718 06:00:35.816066 140236 server/node.go:421 [n?] **** cluster 5f472d15-2c7e-45fe-a9d8-9aeaff818ec1 has been created
I180718 06:00:35.816120 140236 server/server.go:1370 [n?] **** add additional nodes by specifying --join=127.0.0.1:41909
I180718 06:00:35.816848 140236 storage/store.go:1480 [n1,s1] [n1,s1]: failed initial metrics computation: [n1,s1]: system config not yet available
I180718 06:00:35.816934 140236 server/node.go:551 [n1] initialized store [n1,s1]: disk (capacity=512 MiB, available=512 MiB, used=0 B, logicalBytes=6.9 KiB), ranges=1, leases=0, writes=0.00, bytesPerReplica={p10=7043.00 p25=7043.00 p50=7043.00 p75=7043.00 p90=7043.00 pMax=7043.00}, writesPerReplica={p10=0.00 p25=0.00 p50=0.00 p75=0.00 p90=0.00 pMax=0.00}
I180718 06:00:35.816987 140236 server/node.go:399 [n1] node ID 1 initialized
I180718 06:00:35.817054 140236 gossip/gossip.go:363 [n1] NodeDescriptor set to node_id:1 address:<network_field:"tcp" address_field:"127.0.0.1:41909" > attrs:<> locality:<> ServerVersion:<major_val:2 minor_val:0 patch:0 unstable:9 >
I180718 06:00:35.817205 140236 storage/stores.go:222 [n1] read 0 node addresses from persistent storage
I180718 06:00:35.817336 140236 server/node.go:692 [n1] connecting to gossip network to verify cluster ID...
I180718 06:00:35.817815 140236 server/node.go:717 [n1] node connected via gossip and verified as part of cluster "5f472d15-2c7e-45fe-a9d8-9aeaff818ec1"
I180718 06:00:35.817843 140236 server/node.go:485 [n1] node=1: started with [<no-attributes>=<in-mem>] engine(s) and attributes []
I180718 06:00:35.818095 140236 server/config.go:336 [n1] available memory from cgroups (8.0 EiB) exceeds system memory 16 GiB, using system memory
I180718 06:00:35.818115 140236 server/server.go:1799 [n1] Could not start heap profiler worker due to: directory to store profiles could not be determined
I180718 06:00:35.818170 140236 server/server.go:1507 [n1] starting https server at 127.0.0.1:33701
I180718 06:00:35.818185 140236 server/server.go:1508 [n1] starting grpc/postgres server at 127.0.0.1:41909
I180718 06:00:35.818197 140236 server/server.go:1509 [n1] advertising CockroachDB node at 127.0.0.1:41909
I180718 06:00:35.819980 140479 storage/replica_command.go:279 [split,n1,s1,r1/1:/M{in-ax}] initiating a split of this range at key /System/"" [r2]
I180718 06:00:35.850005 140500 storage/replica_command.go:279 [split,n1,s1,r2/1:/{System/-Max}] initiating a split of this range at key /System/NodeLiveness [r3]
I180718 06:00:35.855250 140225 sql/event_log.go:126 [n1,intExec=optInToDiagnosticsStatReporting] Event: "set_cluster_setting", target: 0, info: {SettingName:diagnostics.reporting.enabled Value:true User:root}
I180718 06:00:35.858839 140549 storage/replica_command.go:279 [split,n1,s1,r3/1:/{System/NodeL…-Max}] initiating a split of this range at key /System/NodeLivenessMax [r4]
I180718 06:00:35.871268 140558 storage/replica_command.go:279 [split,n1,s1,r4/1:/{System/NodeL…-Max}] initiating a split of this range at key /System/tsd [r5]
W180718 06:00:35.878149 140555 storage/intent_resolver.go:642 [n1,s1] failed to push during intent resolution: failed to push "sql txn" id=087969e6 key=/Table/SystemConfigSpan/Start rw=true pri=0.04903586 iso=SERIALIZABLE stat=PENDING epo=0 ts=1531893635.857629636,0 orig=1531893635.857629636,0 max=1531893635.857629636,0 wto=false rop=false seq=12
W180718 06:00:35.880689 140284 storage/intent_resolver.go:642 [n1,s1] failed to push during intent resolution: failed to push "sql txn" id=087969e6 key=/Table/SystemConfigSpan/Start rw=true pri=0.04903586 iso=SERIALIZABLE stat=PENDING epo=0 ts=1531893635.857629636,0 orig=1531893635.857629636,0 max=1531893635.857629636,0 wto=false rop=false seq=12
I180718 06:00:35.880861 140287 storage/replica_command.go:279 [split,n1,s1,r5/1:/{System/tsd-Max}] initiating a split of this range at key /System/"tse" [r6]
I180718 06:00:35.886469 140547 sql/event_log.go:126 [n1,intExec=set-setting] Event: "set_cluster_setting", target: 0, info: {SettingName:version Value:$1 User:root}
I180718 06:00:35.888893 140595 storage/replica_command.go:279 [split,n1,s1,r6/1:/{System/tse-Max}] initiating a split of this range at key /Table/SystemConfigSpan/Start [r7]
I180718 06:00:35.898822 140504 storage/replica_command.go:279 [split,n1,s1,r7/1:/{Table/System…-Max}] initiating a split of this range at key /Table/11 [r8]
W180718 06:00:35.899859 140139 storage/intent_resolver.go:642 [n1,s1] failed to push during intent resolution: failed to push "sql txn" id=2cd11d4b key=/Table/SystemConfigSpan/Start rw=true pri=0.00590906 iso=SERIALIZABLE stat=PENDING epo=0 ts=1531893635.886973291,0 orig=1531893635.886973291,0 max=1531893635.886973291,0 wto=false rop=false seq=6
I180718 06:00:35.901288 140567 sql/event_log.go:126 [n1,intExec=disableNetTrace] Event: "set_cluster_setting", target: 0, info: {SettingName:trace.debug.enable Value:false User:root}
I180718 06:00:35.906231 140604 storage/replica_command.go:279 [split,n1,s1,r8/1:/{Table/11-Max}] initiating a split of this range at key /Table/12 [r9]
I180718 06:00:35.915104 140577 storage/replica_command.go:279 [split,n1,s1,r9/1:/{Table/12-Max}] initiating a split of this range at key /Table/13 [r10]
I180718 06:00:35.918504 140658 storage/replica_command.go:279 [split,n1,s1,r10/1:/{Table/13-Max}] initiating a split of this range at key /Table/14 [r11]
I180718 06:00:35.922794 140669 storage/replica_command.go:279 [split,n1,s1,r11/1:/{Table/14-Max}] initiating a split of this range at key /Table/15 [r12]
I180718 06:00:35.937629 140186 storage/replica_command.go:279 [split,n1,s1,r12/1:/{Table/15-Max}] initiating a split of this range at key /Table/16 [r13]
I180718 06:00:35.939943 140636 sql/event_log.go:126 [n1,intExec=initializeClusterSecret] Event: "set_cluster_setting", target: 0, info: {SettingName:cluster.secret Value:gen_random_uuid()::STRING User:root}
I180718 06:00:35.947892 140654 storage/replica_command.go:279 [split,n1,s1,r13/1:/{Table/16-Max}] initiating a split of this range at key /Table/17 [r14]
I180718 06:00:35.956696 140678 storage/replica_command.go:279 [split,n1,s1,r14/1:/{Table/17-Max}] initiating a split of this range at key /Table/18 [r15]
I180718 06:00:35.957638 140639 sql/event_log.go:126 [n1,intExec=create-default-db] Event: "create_database", target: 50, info: {DatabaseName:defaultdb Statement:CREATE DATABASE IF NOT EXISTS defaultdb User:root}
W180718 06:00:35.961624 140623 storage/intent_resolver.go:642 [n1,s1] failed to push during intent resolution: failed to push "split" id=dc5dbea6 key=/Local/Range/Table/17/RangeDescriptor rw=true pri=0.01350758 iso=SERIALIZABLE stat=PENDING epo=0 ts=1531893635.956709502,0 orig=1531893635.956709502,0 max=1531893635.956709502,0 wto=false rop=false seq=1
I180718 06:00:35.964113 140620 sql/event_log.go:126 [n1,intExec=create-default-db] Event: "create_database", target: 51, info: {DatabaseName:postgres Statement:CREATE DATABASE IF NOT EXISTS postgres User:root}
I180718 06:00:35.965689 140236 server/server.go:1588 [n1] done ensuring all necessary migrations have run
I180718 06:00:35.965713 140236 server/server.go:1591 [n1] serving sql connections
I180718 06:00:35.969556 140726 server/server_update.go:65 [n1] no need to upgrade, cluster already at the newest version
I180718 06:00:35.972183 140728 sql/event_log.go:126 [n1] Event: "node_join", target: 1, info: {Descriptor:{NodeID:1 Address:{NetworkField:tcp AddressField:127.0.0.1:41909} Attrs: Locality: ServerVersion:2.0-9} ClusterID:5f472d15-2c7e-45fe-a9d8-9aeaff818ec1 StartedAt:1531893635817824229 LastUp:1531893635817824229}
I180718 06:00:36.017522 140710 storage/replica_command.go:279 [split,n1,s1,r15/1:/{Table/18-Max}] initiating a split of this range at key /Table/19 [r16]
I180718 06:00:36.022151 140192 storage/replica_command.go:279 [split,n1,s1,r16/1:/{Table/19-Max}] initiating a split of this range at key /Table/20 [r17]
I180718 06:00:36.027443 140743 storage/replica_command.go:279 [split,n1,s1,r17/1:/{Table/20-Max}] initiating a split of this range at key /Table/21 [r18]
I180718 06:00:36.032167 140751 storage/replica_command.go:279 [split,n1,s1,r18/1:/{Table/21-Max}] initiating a split of this range at key /Table/22 [r19]
W180718 06:00:36.038594 140789 storage/intent_resolver.go:642 [n1,s1] failed to push during intent resolution: failed to push "split" id=199d441e key=/Local/Range/Table/21/RangeDescriptor rw=true pri=0.01348378 iso=SERIALIZABLE stat=PENDING epo=0 ts=1531893636.036955778,1 orig=1531893636.033966335,0 max=1531893636.033966335,0 wto=false rop=false seq=1
I180718 06:00:36.041813 140779 storage/replica_command.go:279 [split,n1,s1,r19/1:/{Table/22-Max}] initiating a split of this range at key /Table/23 [r20]
I180718 06:00:36.097673 140545 storage/replica_command.go:279 [split,n1,s1,r20/1:/{Table/23-Max}] initiating a split of this range at key /Table/50 [r21]
I180718 06:00:36.101754 140834 storage/replica_command.go:279 [split,n1,s1,r21/1:/{Table/50-Max}] initiating a split of this range at key /Table/51 [r22]
I180718 06:00:36.142862 140851 sql/event_log.go:126 [n1,client=127.0.0.1:51940,user=root] Event: "create_database", target: 52, info: {DatabaseName:t Statement:CREATE DATABASE t User:root}
I180718 06:00:36.145786 140807 storage/replica_command.go:279 [split,n1,s1,r22/1:/{Table/51-Max}] initiating a split of this range at key /Table/52 [r23]
I180718 06:00:36.146019 140851 sql/event_log.go:126 [n1,client=127.0.0.1:51940,user=root] Event: "create_table", target: 53, info: {TableName:t.public.test Statement:CREATE TABLE t.public.test (k INT PRIMARY KEY, v INT) User:root}
I180718 06:00:36.184419 140704 storage/replica_command.go:279 [split,n1,s1,r23/1:/{Table/52-Max}] initiating a split of this range at key /Table/53 [r24]
I180718 06:00:36.380397 140851 sql/lease.go:312 [n1,client=127.0.0.1:51940,user=root] publish: descID=53 (test) version=2 mtime=2018-07-18 06:00:36.378101177 +0000 UTC
I180718 06:00:36.384192 140851 sql/event_log.go:126 [n1,client=127.0.0.1:51940,user=root] Event: "alter_table", target: 53, info: {TableName:t.public.test Statement:ALTER TABLE t.public.test ADD COLUMN x DECIMAL NOT NULL DEFAULT 1.4::DECIMAL User:root MutationID:1 CascadeDroppedViews:[]}
I180718 06:00:36.386776 140839 sql/lease.go:858 new lease: 53("test") ver=2:1531893887.789506692,0, refcount=0
I180718 06:00:36.401384 140851 sql/lease.go:312 [n1,client=127.0.0.1:51940,user=root] publish: descID=53 (test) version=3 mtime=2018-07-18 06:00:36.385465955 +0000 UTC
I180718 06:00:36.419942 140488 sql/lease.go:312 [n1,scExec=] publish: descID=53 (test) version=3 mtime=2018-07-18 06:00:36.419802601 +0000 UTC
I180718 06:00:36.422747 140488 sql/lease.go:286 publish (1 leases): desc=[{test 53 2}]
I180718 06:00:36.424806 140932 sql/lease.go:858 new lease: 53("test") ver=3:1531894002.492974377,0, refcount=0
I180718 06:00:36.436825 140851 sql/lease.go:312 [n1,client=127.0.0.1:51940,user=root] publish: descID=53 (test) version=4 mtime=2018-07-18 06:00:36.423983012 +0000 UTC
I180718 06:00:36.445916 140851 sql/lease.go:312 [n1,client=127.0.0.1:51940,user=root] publish: descID=53 (test) version=4 mtime=2018-07-18 06:00:36.444228515 +0000 UTC
I180718 06:00:36.445989 140488 sql/backfill.go:135 [n1,scExec=] Running backfill for "test", v=3, m=1
I180718 06:00:36.452720 140851 sql/lease.go:312 [n1,client=127.0.0.1:51940,user=root] publish: descID=53 (test) version=4 mtime=2018-07-18 06:00:36.446008979 +0000 UTC
I180718 06:00:36.457660 140851 sql/event_log.go:126 [n1,client=127.0.0.1:51940,user=root] Event: "alter_table", target: 53, info: {TableName:t.public.test Statement:ALTER TABLE t.public.test DROP COLUMN v User:root MutationID:2 CascadeDroppedViews:[]}
I180718 06:00:36.457690 140851 sql/event_log.go:126 [n1,client=127.0.0.1:51940,user=root] Event: "alter_table", target: 53, info: {TableName:t.public.test Statement:ALTER TABLE t.public.test DROP COLUMN v User:root MutationID:2 CascadeDroppedViews:[]}
I180718 06:00:36.459828 140872 sql/lease.go:858 new lease: 53("test") ver=4:1531894002.493974377,0, refcount=0
W180718 06:00:36.460468 140488 sql/schema_changer.go:658 [n1,scExec=] reversing schema change 366422380965888001 due to irrecoverable error: can't find job 0: job with ID 0 does not exist
W180718 06:00:36.472071 140488 sql/schema_changer.go:924 [n1,scExec=] reverse schema change mutation: {Descriptor_:0xc423597b30 State:DELETE_AND_WRITE_ONLY Direction:ADD MutationID:1 Rollback:false}
I180718 06:00:36.480522 140488 sql/lease.go:312 [n1,scExec=] publish: descID=53 (test) version=5 mtime=2018-07-18 06:00:36.466588727 +0000 UTC
I180718 06:00:36.483347 140488 sql/event_log.go:126 [n1,scExec=] Event: "reverse_schema_change", target: 53, info: {Error:job with ID 0 does not exist
can't find job 0
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.GetResumeSpansFromJob
/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/backfiller.go:155
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChanger).distBackfill
/go/src/github.com/cockroachdb/cockroach/pkg/sql/backfill.go:444
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChanger).truncateAndBackfillColumns
/go/src/github.com/cockroachdb/cockroach/pkg/sql/backfill.go:579
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChanger).runBackfill
/go/src/github.com/cockroachdb/cockroach/pkg/sql/backfill.go:191
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChanger).runStateMachineAndBackfill
/go/src/github.com/cockroachdb/cockroach/pkg/sql/schema_changer.go:871
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChanger).exec
/go/src/github.com/cockroachdb/cockroach/pkg/sql/schema_changer.go:638
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChangeManager).Start.func1.1
/go/src/github.com/cockroachdb/cockroach/pkg/sql/schema_changer.go:1189
github.com/cockroachdb/cockroach/pkg/sql.(*SchemaChangeManager).Start.func1
/go/src/github.com/cockroachdb/cockroach/pkg/sql/schema_changer.go:1366
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker.func1
/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:195
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:2361 MutationID:1}
I180718 06:00:36.486174 140488 sql/lease.go:286 publish (1 leases): desc=[{test 53 4}]
I180718 06:00:36.487688 140764 sql/lease.go:858 new lease: 53("test") ver=5:1531894002.494974377,0, refcount=0
I180718 06:00:36.544356 140488 sql/lease.go:312 [n1,scExec=] publish: descID=53 (test) version=6 mtime=2018-07-18 06:00:36.544199799 +0000 UTC
I180718 06:00:36.548696 141018 sql/lease.go:858 new lease: 53("test") ver=6:1531894002.495974377,0, refcount=0
I180718 06:00:36.554582 140488 sql/backfill.go:135 [n1,scExec=] Running backfill for "test", v=6, m=1
I180718 06:00:36.575436 140851 sql/lease.go:312 [n1,client=127.0.0.1:51940,user=root] publish: descID=53 (test) version=7 mtime=2018-07-18 06:00:36.574447293 +0000 UTC
I180718 06:00:36.579631 140851 sql/event_log.go:126 [n1,client=127.0.0.1:51940,user=root] Event: "truncate_table", target: 53, info: {TableName:t.public.test Statement:TRUNCATE TABLE t.test User:root}
E180718 06:00:36.580134 140306 sql/lease.go:1552 /Table/3/1/54/2/1: received invalid table descriptor: family "primary" contains unknown column "2". Desc: name:"test" id:54 parent_id:52 version:1 up_version:false modification_time:<wall_time:1531893636544199799 > columns:<name:"k" id:1 type:<semantic_type:INT width:0 precision:0 visible_type:NONE > nullable:false hidden:false > next_column_id:4 families:<name:"primary" id:0 column_names:"k" column_names:"v" column_names:"x" column_ids:1 column_ids:2 column_ids:3 default_column_id:2 > next_family_id:1 primary_index:<name:"primary" id:1 unique:true column_names:"k" column_directions:ASC column_ids:1 foreign_key:<table:0 index:0 name:"" validity:Validated shared_prefix_len:0 on_delete:NO_ACTION on_update:NO_ACTION > interleave:<> partitioning:<num_columns:0 > type:FORWARD > next_index_id:2 privileges:<users:<user:"admin" privileges:2 > users:<user:"root" privileges:2 > > lease:<node_id:1 expiration_time:1531893936406815644 > next_mutation_id:3 format_version:3 state:PUBLIC view_query:"" mutationJobs:<mutation_id:1 job_id:366422381271744513 > mutationJobs:<mutation_id:2 job_id:366422381202178049 > drop_time:0 replacement_of:<id:53 time:<wall_time:1531893636574447293 > > audit_mode:DISABLED
I180718 06:00:36.580467 140875 storage/replica_command.go:279 [split,n1,s1,r24/1:/{Table/53-Max}] initiating a split of this range at key /Table/54 [r25]
W180718 06:00:36.584504 140488 sql/schema_changer.go:679 [n1,scExec=] error purging mutation: context deadline exceeded, after error: can't find job 0: job with ID 0 does not exist
I180718 06:00:36.595321 140236 util/stop/stopper.go:473 quiescing; tasks left:
2 [async] kv.TxnCoordSender: heartbeat loop
1 node.Node: batch
1 [async] storage.split: processing replica
1 [async] sql.tableState: releasing descriptor lease
W180718 06:00:36.595439 140875 storage/replica.go:3097 [n1,s1,r24/1:/Table/5{3-4}] during async intent resolution: node unavailable; try another peer
I180718 06:00:36.595544 140236 util/stop/stopper.go:473 quiescing; tasks left:
2 [async] kv.TxnCoordSender: heartbeat loop
1 [async] sql.tableState: releasing descriptor lease
I180718 06:00:36.595601 140236 util/stop/stopper.go:473 quiescing; tasks left:
1 [async] sql.tableState: releasing descriptor lease
1 [async] kv.TxnCoordSender: heartbeat loop
I180718 06:00:36.595640 140236 util/stop/stopper.go:473 quiescing; tasks left:
1 [async] sql.tableState: releasing descriptor lease
W180718 06:00:36.596309 141028 sql/lease.go:229 error releasing lease "53(\"test\") ver=6:1531894002.495974377,0, refcount=0": lease-release: node unavailable; try another peer
W180718 06:00:36.600425 140488 sql/schema_changer.go:600 [n1,scExec=] failed to send RPC: sending to all 1 replicas failed; last error: (err: node unavailable; try another peer) <nil>
W180718 06:00:36.600542 140488 sql/schema_changer.go:563 [n1,scExec=] node unavailable; try another peer
W180718 06:00:36.600585 140488 sql/schema_changer.go:1199 [n1] Error executing schema change: can't find job 0: job with ID 0 does not exist
--- FAIL: TestTruncateWhileColumnBackfill (0.82s)
schema_changer_test.go:2875: 1 columns
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
A-schema-changesC-test-failureBroken test (automatically or manually discovered).Broken test (automatically or manually discovered).O-robotOriginated from a bot.Originated from a bot.