Skip to content

jobs: TestJobInfoUpgradeRegressionTests failed #104653

@cockroach-teamcity

Description

@cockroach-teamcity

jobs.TestJobInfoUpgradeRegressionTests failed with artifacts on release-23.1 @ 2d33fa6b041ba0a7b362358c9b69fcabe005edba:

I230609 09:00:10.325402 10896111 13@kv/kvserver/replicate_queue.go:847  [T1,n1,replicate,s1,r65/1:/Table/10{7-8}] 2034  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230609 09:00:10.327298 10896111 13@kv/kvserver/replicate_queue.go:847  [T1,n1,replicate,s1,r69/1:/Table/5{1/4-2}] 2035  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230609 09:00:10.329706 10896111 13@kv/kvserver/replicate_queue.go:847  [T1,n1,replicate,s1,r38/1:/Table/3{7-8}] 2036  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230609 09:00:10.332830 10896111 13@kv/kvserver/replicate_queue.go:847  [T1,n1,replicate,s1,r34/1:/Table/3{3-4}] 2037  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230609 09:00:10.334179 10896111 13@kv/kvserver/replicate_queue.go:847  [T1,n1,replicate,s1,r13/1:/Table/1{2-3}] 2038  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230609 09:00:10.335557 10896111 13@kv/kvserver/replicate_queue.go:847  [T1,n1,replicate,s1,r33/1:/Table/3{2-3}] 2039  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230609 09:00:10.559139 16866408 jobs/registry.go:1583  [T1,n1] 2040  AUTO CREATE STATS job 872385119179898881: stepping through state running
I230609 09:00:10.779049 16795006 upgrade/upgrades/backfill_jobs_info_table_migration.go:72  [T1,n1,job=MIGRATION id=872385116792913921,upgrade=22.2-80] 2041  backfilling job_info, step0, batch0 done; resume after 872385116792913921, done false
I230609 09:00:10.865413 16795006 upgrade/upgrades/backfill_jobs_info_table_migration.go:72  [T1,n1,job=MIGRATION id=872385116792913921,upgrade=22.2-80] 2042  backfilling job_info, step0, batch1 done; resume after 872385119179898881, done false
I230609 09:00:10.933474 16795006 upgrade/upgrades/backfill_jobs_info_table_migration.go:72  [T1,n1,job=MIGRATION id=872385116792913921,upgrade=22.2-80] 2043  backfilling job_info, step0, batch2 done; resume after 0, done true
I230609 09:00:11.219867 16866408 jobs/registry.go:1583  [T1,n1] 2044  AUTO CREATE STATS job 872385119179898881: stepping through state succeeded
I230609 09:00:11.356493 13658185 sql/stats/automatic_stats.go:833  [T1,n1] 2045  automatically executing "CREATE STATISTICS __auto__ FROM [30] WITH OPTIONS THROTTLING 0.9 AS OF SYSTEM TIME '-30s'"
I230609 09:00:11.549773 16969174 jobs/registry.go:1583  [T1,n1] 2046  AUTO CREATE STATS job 872385122766618625: stepping through state running
I230609 09:00:11.744003 16795006 upgrade/upgrades/backfill_jobs_info_table_migration.go:72  [T1,n1,job=MIGRATION id=872385116792913921,upgrade=22.2-80] 2047  backfilling job_info, step1, batch0 done; resume after 872385119179898881, done false
I230609 09:00:11.815538 16795006 upgrade/upgrades/backfill_jobs_info_table_migration.go:72  [T1,n1,job=MIGRATION id=872385116792913921,upgrade=22.2-80] 2048  backfilling job_info, step1, batch0 done; resume after 872385122766618625, done false
I230609 09:00:11.859987 16795006 upgrade/upgrades/backfill_jobs_info_table_migration.go:72  [T1,n1,job=MIGRATION id=872385116792913921,upgrade=22.2-80] 2049  backfilling job_info, step1, batch1 done; resume after 0, done true
I230609 09:00:11.884631 16795006 jobs/registry.go:1583  [T1,n1] 2050  MIGRATION job 872385116792913921: stepping through state succeeded
I230609 09:00:12.094775 16969174 jobs/registry.go:1583  [T1,n1] 2051  AUTO CREATE STATS job 872385122766618625: stepping through state succeeded
I230609 09:00:12.101499 10778809 jobs/wait.go:145  [T1,n1,client=127.0.0.1:33724,hostssl,user=root,migration-mgr] 2052  waited for 1 [872385116792913921] queued jobs to complete 2.395932067s
I230609 09:00:12.118206 10778809 upgrade/upgrademanager/manager.go:657  [T1,n1,client=127.0.0.1:33724,hostssl,user=root,migration-mgr] 2053  executing operation bump-cluster-version=22.2-80
I230609 09:00:12.123419 10778809 upgrade/upgradecluster/cluster.go:121  [T1,n1,client=127.0.0.1:33724,hostssl,user=root,migration-mgr] 2054  executing bump-cluster-version=22.2-80 on nodes n{1}
I230609 09:00:12.124170 17039390 server/migration.go:150  [T1,n1,bump-cluster-version] 2055  active cluster version setting is now 22.2-80 (up from 22.2-79(fence))
I230609 09:00:12.198130 10778809 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:33724,hostssl,user=root] 2056 ={"Timestamp":1686301164882908040,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING version = $1","Tag":"SET CLUSTER SETTING","User":"root","PlaceholderValues":["'22.2-80'"],"SettingName":"version","Value":"22.2-80"}
    job_info_storage_test.go:355: query 'SELECT count(*) FROM system.job_info WHERE job_id = 872384953098731521': expected:
        4
        
        got:
        3
        
W230609 09:00:12.206234 17047980 kv/kvserver/intentresolver/intent_resolver.go:826  [-] 2057  failed to gc transaction record: could not GC completed transaction anchored at /Table/6/1/"version"/0: node unavailable; try another peer
I230609 09:00:12.206256 10463730 sql/stats/automatic_stats.go:550  [T1,n1] 2058  quiescing auto stats refresher
I230609 09:00:12.206583 10573493 rpc/context.go:2313  [T1,n1,rnode=1,raddr=127.0.0.1:45633,class=default,rpc] 2059  connection heartbeat loop ended with err: <nil>
I230609 09:00:12.207277 14294065 jobs/registry.go:1583  [T1,n1] 2061  KEY VISUALIZER job 100: stepping through state succeeded
I230609 09:00:12.207121 10447072 server/start_listen.go:103  [T1,n1] 2060  server shutting down: instructing cmux to stop accepting
W230609 09:00:12.207385 10456826 sql/sqlliveness/slinstance/slinstance.go:334  [T1,n1] 2062  exiting heartbeat loop
W230609 09:00:12.208842 10456826 sql/sqlliveness/slinstance/slinstance.go:321  [T1,n1] 2063  exiting heartbeat loop with error: node unavailable; try another peer
E230609 09:00:12.209185 10456826 server/server_sql.go:509  [T1,n1] 2064  failed to run update of instance with new session ID: node unavailable; try another peer
E230609 09:00:12.209697 10463770 jobs/registry.go:1032  [T1,n1] 2065  error expiring job sessions: expire-sessions: failed to read query result: query execution canceled
I230609 09:00:12.210044 10503051 jobs/registry.go:1583  [T1,n1] 2066  AUTO SPAN CONFIG RECONCILIATION job 872384922406486017: stepping through state succeeded
E230609 09:00:12.210408 10503051 jobs/registry.go:988  [T1,n1] 2067  error getting live session: node unavailable; try another peer
E230609 09:00:12.211193 10463770 jobs/registry.go:1040  [T1,n1] 2068  failed to serve pause and cancel requests: txn exec: context canceled
E230609 09:00:12.213162 10463780 jobs/registry.go:1100  [T1,n1] 2069  error processing claimed jobs: could not query for claimed jobs: select-running/get-claimed-jobs: failed to read query result: query execution canceled
W230609 09:00:12.213500 17047158 1@sql/event_log.go:653  [T1,n1,client=127.0.0.1:33724,hostssl,user=root] 2070  unable to save 1 entries to system.eventlog: log-event: failed to read query result: query execution canceled
W230609 09:00:12.216976 13658185 sql/stats/automatic_stats.go:819  [T1,n1] 2071  failed to create statistics on table 30: create-stats: job 872385122766618625: could not mark as succeeded: job 872385122766618625: write-job-info-insert: failed to read query result: query execution canceled
W230609 09:00:12.225290 14443267 jobs/adopt.go:518  [T1,n1] 2072  could not clear job claim: clear-job-claim: failed to send RPC: sending to all replicas failed; last error: ba: Scan [/Table/15/1/101,/Table/15/1/102), [txn: 7c3e4c58], [can-forward-ts] RPC error: node unavailable; try another peer
W230609 09:00:12.232215 14294065 jobs/adopt.go:518  [T1,n1] 2073  could not clear job claim: clear-job-claim: node unavailable; try another peer
I230609 09:00:12.238466 10441282 server/server_controller_orchestration.go:263  [T1,n1] 2074  server controller shutting down ungracefully
I230609 09:00:12.238605 10441282 server/server_controller_orchestration.go:274  [T1,n1] 2075  waiting for tenant servers to report stopped
W230609 09:00:12.239941 10441282 server/server_sql.go:1710  [T1,n1] 2076  server shutdown without a prior graceful drain
--- FAIL: TestJobInfoUpgradeRegressionTests (62.27s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/jobs

This test on roachdash | Improve this report!

Jira issue: CRDB-28653

Metadata

Metadata

Assignees

Labels

C-test-failureBroken test (automatically or manually discovered).O-robotOriginated from a bot.T-jobsbranch-release-23.1Used to mark GA and release blockers, technical advisories, and bugs for 23.1release-blockerIndicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.v23.1.4

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions