What happened?
Dataflow is making DataSampling FnAPI requests even when DataSampling is disabled. Because the feature wasn't enabled, the Go SDK doesn't initialize the DataSampler, so handling the request leads to a nil pointer panic:
2023-12-08 16:28:11.597 PST
panic({0x1264220?, 0x2786ff0?})
	runtime/panic.go:914 +0x21f
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSampler).getAllSamples(0x0)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec/datasampler.go:79 +0x4b
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSampler).GetSamples(0x1312c00?, {0x0?, 0xc000234480?, 0xc00053b508?})
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec/datasampler.go:62 +0x1d
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.(*control).handleInstruction(0xc000318000, {0x180beb0?, 0xc000091c80?}, 0xc000147ef0)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:668 +0x116f
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.MainWithOptions.func4({0x180beb0?, 0xc000091c80?}, 0xc000091c80?)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:202 +0x74
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.MainWithOptions({0x180beb0, 0xc000091c20}, {0x7fffd16304e4, 0xf}, {0x7fffd1630507, 0xf}, {{0xc000225e70, 0x1, 0x1}, {0xc000046010, ...}})
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:222 +0x1022
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/init.hook()
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/init/init.go:144 +0x50c
The fix is easy enough. It wasn't caught sooner because we didn't run the Dataflow Go PostCommits.
First: Fix the issue and cherry-pick it into 2.53.0.
Second: While this kind of failure is very unlikely, it would have been caught by a simple Dataflow Go Wordcount test run as a pre-commit. I'll add that.
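For illustration, here is a minimal sketch of one way to make a sampler tolerate being called while uninitialized, which is the situation the stack trace shows (a method invoked on a nil *DataSampler). The dataSampler type, its fields, and getSamples below are hypothetical stand-ins, not the actual Beam code, and the real fix may instead guard the call site in the harness.

```go
package main

import (
	"fmt"
	"sync"
)

// dataSampler is a hypothetical stand-in for the SDK's sampler.
// It is NOT the Beam exec.DataSampler type.
type dataSampler struct {
	mu      sync.Mutex
	samples map[string][][]byte // PCollection ID -> sampled elements
}

// getSamples returns the samples for the requested PCollection IDs,
// or all samples when no IDs are given.
// The nil check makes the method safe to call when sampling was never
// enabled and the sampler pointer was therefore left nil.
func (d *dataSampler) getSamples(ids []string) map[string][][]byte {
	out := make(map[string][][]byte)
	if d == nil {
		// Sampling disabled: report no samples instead of dereferencing nil.
		return out
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	if len(ids) == 0 {
		for id, s := range d.samples {
			out[id] = s
		}
		return out
	}
	for _, id := range ids {
		if s, ok := d.samples[id]; ok {
			out[id] = s
		}
	}
	return out
}

func main() {
	// Sampling disabled: the pointer is nil, but the call does not panic.
	var disabled *dataSampler
	fmt.Println(len(disabled.getSamples(nil)))

	// Sampling enabled: samples are returned as usual.
	enabled := &dataSampler{samples: map[string][][]byte{"pc1": {[]byte("a")}}}
	fmt.Println(len(enabled.getSamples([]string{"pc1"})))
}
```

Go allows calling a method on a nil pointer receiver; the panic only occurs once the method touches a field, so an early nil check in the method (or a check before the call) is enough to avoid it.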
Issue Priority
Priority: 1 (data loss / total loss of function)
Issue Components