tpcc.setupTPCCMetrics must be idempotent--retry leads to unhandled panic due to prometheus.AlreadyRegisteredError #77627

@srosenberg

Description

When running cockroach workload fixtures load tpcc, the load generator crashed (see the log below). The root cause is that tpcc.setupTPCCMetrics incorrectly assumes that NewCounter will not panic. However, if the counter was already registered by the first invocation of setupTPCCMetrics, a subsequent invocation panics with prometheus.AlreadyRegisteredError. Since tpcc.setupTPCCMetrics is expected to be retried (see tpcc.Ops and workload.cli.run.runRun), it must be idempotent.
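One possible shape for an idempotent setup is sketched below. It is not the actual tpcc code and it does not import client_golang: the registry, counter, and error types are hypothetical stdlib-only stand-ins, and the metric name tpcc_example_total is made up for illustration. The idea it models is that a duplicate registration hands back the collector that was registered first, so a retry can reuse it instead of panicking. In client_golang the analogous information is available because Register returns a prometheus.AlreadyRegisteredError whose ExistingCollector field holds the earlier collector.

```go
package main

import (
	"errors"
	"fmt"
)

// counter is a hypothetical stand-in for a Prometheus counter.
type counter struct {
	name  string
	value int
}

// alreadyRegisteredError models prometheus.AlreadyRegisteredError: it carries
// the collector that was registered first under the same name.
type alreadyRegisteredError struct {
	existing *counter
}

func (e alreadyRegisteredError) Error() string {
	return "duplicate metrics collector registration attempted"
}

// registry is a hypothetical stand-in for a Prometheus registry.
type registry struct {
	counters map[string]*counter
}

func newRegistry() *registry {
	return &registry{counters: map[string]*counter{}}
}

// register returns alreadyRegisteredError when the name is already taken.
func (r *registry) register(c *counter) error {
	if prev, ok := r.counters[c.name]; ok {
		return alreadyRegisteredError{existing: prev}
	}
	r.counters[c.name] = c
	return nil
}

// mustNewCounter mirrors the behavior seen in the stack trace: a second
// registration of the same name panics.
func (r *registry) mustNewCounter(name string) *counter {
	c := &counter{name: name}
	if err := r.register(c); err != nil {
		panic(err)
	}
	return c
}

// getOrRegisterCounter is the idempotent variant: on a duplicate it returns
// the counter that was registered first instead of panicking.
func (r *registry) getOrRegisterCounter(name string) (*counter, error) {
	c := &counter{name: name}
	err := r.register(c)
	var are alreadyRegisteredError
	if errors.As(err, &are) {
		return are.existing, nil
	}
	if err != nil {
		return nil, err
	}
	return c, nil
}

func main() {
	reg := newRegistry()
	first, _ := reg.getOrRegisterCounter("tpcc_example_total")
	second, _ := reg.getOrRegisterCounter("tpcc_example_total") // simulated retry
	fmt.Println(first == second)                                // prints "true"
}
```

An alternative with the same effect would be to create the metrics once, outside the retried path, so that the setup function is never asked to register twice. Which of the two fits tpcc better depends on how the metrics struct is shared between retries.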

W220203 23:36:17.617076 1 workload/cli/run.go:423  [-] 15313  retrying after error while creating load: failed to initialize the load generator: dial tcp 10.150.0.52:26257: connect: cannot assign requested address
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +duplicate metrics collector registration attempted
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +(1) attached stack trace
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  -- stack trace:
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/prometheus/client_golang/prometheus/promauto.Factory.NewCounter
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/vendor/github.com/prometheus/client_golang/prometheus/promauto/auto.go:265
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/workload/tpcc.setupTPCCMetrics
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/workload/tpcc/worker.go:100
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/workload/tpcc.(*tpcc).Ops
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/workload/tpcc/tpcc.go:741
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/workload/cli.runRun.func2
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/workload/cli/run.go:425
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/workload/cli.runRun
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/workload/cli/run.go:442
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/workload/cli.CmdHelper.func1
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/workload/cli/run.go:224
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/workload/cli.HandleErrs.func1
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/workload/cli/cli.go:87
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/spf13/cobra.(*Command).execute
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:860
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/spf13/cobra.(*Command).ExecuteC
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:974
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/spf13/cobra.(*Command).Execute
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:902
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/cli.Run
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/cli/cli.go:295
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/cli.doMain
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/cli/cli.go:137
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | github.com/cockroachdb/cockroach/pkg/cli.Main
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/cli/cli.go:64
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  | main.main
E220203 23:36:17.617243 1 1@util/log/logcrash/crash_reporting.go:174  [-] 15314 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/cmd/cockroach/main.go:26

Jira issue: CRDB-13685

Labels

A-testing: Testing tools and infrastructure
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-cloudreport: Originated from CloudReport
T-testeng: TestEng Team