jobs: TestJobInfoUpgradeRegressionTests failed #106347
Closed
Labels
- A-disaster-recovery
- A-jobs
- C-test-failure: Broken test (automatically or manually discovered).
- O-robot: Originated from a bot.
- T-jobs
- branch-master: Failures and bugs on the master branch.
- release-blocker: Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
- v23.1.9
Description
jobs.TestJobInfoUpgradeRegressionTests failed with artifacts on master @ 818aec861357579eb3a3e987cf5887f3cf112be4:
I230706 22:11:07.395057 1840 upgrade/upgradecluster/cluster.go:121 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1767 executing bump-cluster-version=1000022.2-77(fence) on nodes n{1}
I230706 22:11:07.404103 16892 server/migration.go:150 [T1,n1,bump-cluster-version] 1768 active cluster version setting is now 1000022.2-77(fence) (up from 1000022.2-76)
I230706 22:11:07.404575 1840 upgrade/upgrademanager/manager.go:657 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1769 executing operation validate-cluster-version=1000022.2-78
I230706 22:11:07.404985 1840 upgrade/upgradecluster/cluster.go:121 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1770 executing validate-cluster-version=1000022.2-78 on nodes n{1}
I230706 22:11:07.406167 1840 upgrade/upgrademanager/manager.go:657 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1771 executing operation bump-cluster-version=1000022.2-78
I230706 22:11:07.406594 1840 upgrade/upgradecluster/cluster.go:121 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1772 executing bump-cluster-version=1000022.2-78 on nodes n{1}
I230706 22:11:07.406999 16897 server/migration.go:150 [T1,n1,bump-cluster-version] 1773 active cluster version setting is now 1000022.2-78 (up from 1000022.2-77(fence))
I230706 22:11:07.421796 1840 upgrade/upgrademanager/manager.go:517 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1774 stepping through 1000022.2-80
I230706 22:11:07.421963 1840 upgrade/upgrademanager/manager.go:657 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1775 executing operation bump-cluster-version=1000022.2-79(fence)
I230706 22:11:07.422398 1840 upgrade/upgradecluster/cluster.go:121 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1776 executing bump-cluster-version=1000022.2-79(fence) on nodes n{1}
I230706 22:11:07.423272 16939 server/migration.go:150 [T1,n1,bump-cluster-version] 1777 active cluster version setting is now 1000022.2-79(fence) (up from 1000022.2-78)
I230706 22:11:07.424778 1840 upgrade/upgrademanager/manager.go:657 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1778 executing operation validate-cluster-version=1000022.2-80
I230706 22:11:07.425030 1840 upgrade/upgradecluster/cluster.go:121 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1779 executing validate-cluster-version=1000022.2-80 on nodes n{1}
I230706 22:11:07.450150 16863 jobs/adopt.go:261 [T1,n1] 1781 job 880184745739550721: resuming execution
I230706 22:11:07.448976 1840 upgrade/upgrademanager/manager.go:742 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1780 running Upgrade to 1000022.2-80: "backfill the system.job_info table with the payload and progress of each job in the system.jobs table"
I230706 22:11:07.457530 16865 jobs/registry.go:1606 [T1,n1] 1782 MIGRATION job 880184745739550721: stepping through state running
I230706 22:11:07.547964 16865 upgrade/upgrades/backfill_job_info_table_migration.go:81 [T1,n1,job=MIGRATION id=880184745739550721,upgrade=1000022.2-80] 1783 backfilling job_info, step0, batch0 done; resume after 0, done false
I230706 22:11:07.551043 16865 upgrade/upgrades/backfill_job_info_table_migration.go:81 [T1,n1,job=MIGRATION id=880184745739550721,upgrade=1000022.2-80] 1784 backfilling job_info, step0, batch1 done; resume after 880184745739550721, done true
I230706 22:11:07.632088 16865 upgrade/upgrades/backfill_job_info_table_migration.go:81 [T1,n1,job=MIGRATION id=880184745739550721,upgrade=1000022.2-80] 1785 backfilling job_info, step1, batch0 done; resume after 0, done false
I230706 22:11:07.649981 16865 upgrade/upgrades/backfill_job_info_table_migration.go:81 [T1,n1,job=MIGRATION id=880184745739550721,upgrade=1000022.2-80] 1786 backfilling job_info, step1, batch1 done; resume after 880184745739550721, done true
I230706 22:11:07.651325 16865 jobs/registry.go:1606 [T1,n1] 1787 MIGRATION job 880184745739550721: stepping through state succeeded
I230706 22:11:07.662184 1840 jobs/wait.go:145 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1788 waited for 1 [880184745739550721] queued jobs to complete 210.019003ms
I230706 22:11:07.662257 1840 upgrade/upgrademanager/manager.go:657 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1789 executing operation bump-cluster-version=1000022.2-80
I230706 22:11:07.662566 1840 upgrade/upgradecluster/cluster.go:121 [T1,n1,client=127.0.0.1:41314,hostssl,user=root,migration-mgr] 1790 executing bump-cluster-version=1000022.2-80 on nodes n{1}
I230706 22:11:07.662831 17135 server/migration.go:150 [T1,n1,bump-cluster-version] 1791 active cluster version setting is now 1000022.2-80 (up from 1000022.2-79(fence))
I230706 22:11:07.667852 1840 util/log/event_log.go:32 [T1,n1,client=127.0.0.1:41314,hostssl,user=root] 1792 ={"Timestamp":1688681461014926435,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING version = $1","Tag":"SET CLUSTER SETTING","User":"root","PlaceholderValues":["'1000022.2-80'"],"SettingName":"version","Value":"1000022.2-80"}
job_info_storage_test.go:366: query 'SELECT count(*) FROM crdb_internal.system_jobs WHERE job_type = 'BACKUP'': expected:
1
got:
0
W230706 22:11:07.756134 17097 kv/kvserver/intentresolver/intent_resolver.go:826 [-] 1793 failed to gc transaction record: could not GC completed transaction anchored at /Table/6/1/"version"/0: node unavailable; try another peer
I230706 22:11:07.756204 900 sql/stats/automatic_stats.go:572 [T1,n1] 1794 quiescing auto stats refresher
I230706 22:11:07.756382 10921 jobs/registry.go:1606 [T1,n1] 1795 KEY VISUALIZER job 100: stepping through state succeeded
W230706 22:11:07.758610 10921 jobs/adopt.go:531 [T1,n1] 1796 could not clear job claim: clear-job-claim: failed to send RPC: sending to all replicas failed; last error: ba: Scan [/Table/15/1/100,/Table/15/1/101), [txn: cac76053], [can-forward-ts] RPC error: node unavailable; try another peer
I230706 22:11:07.759080 901 sql/stats/automatic_stats.go:624 [T1,n1] 1797 quiescing stats garbage collector
I230706 22:11:07.759309 373 server/start_listen.go:103 [T1,n1] 1798 server shutting down: instructing cmux to stop accepting
I230706 22:11:07.762217 9363 jobs/registry.go:1606 [T1,n1] 1799 AUTO SPAN CONFIG RECONCILIATION job 880184732354183169: stepping through state succeeded
W230706 22:11:07.762427 11268 jobs/adopt.go:531 [T1,n1] 1800 could not clear job claim: clear-job-claim: node unavailable; try another peer
W230706 22:11:07.762529 650 sql/sqlliveness/slinstance/slinstance.go:334 [T1,n1] 1801 exiting heartbeat loop
W230706 22:11:07.764876 650 sql/sqlliveness/slinstance/slinstance.go:321 [T1,n1] 1804 exiting heartbeat loop with error: node unavailable; try another peer
I230706 22:11:07.762669 977 jobs/registry.go:1606 [T1,n1] 1802 AUTO SPAN CONFIG RECONCILIATION job 880184715132862465: stepping through state succeeded
W230706 22:11:07.764785 9363 jobs/adopt.go:531 [T1,n1] 1803 could not clear job claim: clear-job-claim: node unavailable; try another peer
E230706 22:11:07.765004 650 server/server_sql.go:514 [T1,n1] 1805 failed to run update of instance with new session ID: node unavailable; try another peer
E230706 22:11:07.765174 977 jobs/registry.go:1004 [T1,n1] 1806 error getting live session: node unavailable; try another peer
I230706 22:11:07.768845 58 server/server_controller_orchestration.go:263 [T1,n1] 1807 server controller shutting down ungracefully
I230706 22:11:07.769028 58 server/server_controller_orchestration.go:274 [T1,n1] 1808 waiting for tenant servers to report stopped
W230706 22:11:07.769212 58 server/server_sql.go:1712 [T1,n1] 1809 server shutdown without a prior graceful drain
--- FAIL: TestJobInfoUpgradeRegressionTests (9.81s)
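For readers following the log above, the order of operations for a single upgrade step can be sketched as below. This is only an illustration of the sequence visible in the log (fence bump, validate, run the upgrade, final bump), not the upgrade manager's actual code. The function name `upgradeSteps`, the `run-upgrade=` label, and the assumption that the fence is always the target's internal version minus one are hypothetical and inferred from the log lines only.

```go
package main

import "fmt"

// upgradeSteps lists the operations the log shows for one upgrade step to
// the internal version `target` (for example 80 for 1000022.2-80).
// Assumption: the fence version is target-1, as seen in the log (79 before 80).
// The "run-upgrade=" label is hypothetical; the log only says "running Upgrade to ...".
func upgradeSteps(prefix string, target int, hasMigration bool) []string {
	fence := fmt.Sprintf("%s-%d(fence)", prefix, target-1)
	final := fmt.Sprintf("%s-%d", prefix, target)

	steps := []string{
		"bump-cluster-version=" + fence,
		"validate-cluster-version=" + final,
	}
	if hasMigration {
		// In the log, the job_info backfill runs after validation and
		// before the final version bump.
		steps = append(steps, "run-upgrade="+final)
	}
	return append(steps, "bump-cluster-version="+final)
}

func main() {
	for _, s := range upgradeSteps("1000022.2", 80, true) {
		fmt.Println(s)
	}
}
```

In the log, the failing assertion in job_info_storage_test.go:366 comes after the final bump to 1000022.2-80, that is, after the backfill migration job had already reported success.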
Same failure on other branches
- #106246 jobs: TestJobInfoUpgradeRegressionTests failed [C-test-failure O-robot T-jobs branch-release-23.1]
This test on roachdash | Improve this report!
Jira issue: CRDB-29520