[Bug]: Go SDK Dataflow jobs fail on DataSampling disabled #29760

@lostluck

What happened?

Dataflow is making DataSampling FnAPI requests even when DataSampling is disabled. Because the feature wasn't enabled, the Go SDK doesn't initialize the DataSampler, so handling the request dereferences a nil pointer and panics.

All lines below were logged at 2023-12-08 16:28:11.597 PST.

panic({0x1264220?, 0x2786ff0?})
	runtime/panic.go:914 +0x21f
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSampler).getAllSamples(0x0)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec/datasampler.go:79 +0x4b
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSampler).GetSamples(0x1312c00?, {0x0?, 0xc000234480?, 0xc00053b508?})
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec/datasampler.go:62 +0x1d
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.(*control).handleInstruction(0xc000318000, {0x180beb0?, 0xc000091c80?}, 0xc000147ef0)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:668 +0x116f
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.MainWithOptions.func4({0x180beb0?, 0xc000091c80?}, 0xc000091c80?)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:202 +0x74
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.MainWithOptions({0x180beb0, 0xc000091c20}, {0x7fffd16304e4, 0xf}, {0x7fffd1630507, 0xf}, {{0xc000225e70, 0x1, 0x1}, {0xc000046010, ...}})
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:222 +0x1022
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/init.hook()
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/init/init.go:144 +0x50c

The fix is easy enough. It wasn't caught sooner because we didn't run the Dataflow Go Postcommits.
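For illustration, here is a minimal sketch of the kind of nil guard that avoids this class of panic. It is not the actual Beam code: the `sampler` type, its fields, and `getSamples` are hypothetical stand-ins for the SDK's DataSampler, and it assumes that an empty response is an acceptable answer when sampling is disabled.

```go
package main

import (
	"fmt"
	"sync"
)

// sampler is a hypothetical stand-in for the SDK's data sampler.
// It is NOT the real Beam type; names and fields are assumptions.
type sampler struct {
	mu      sync.Mutex
	samples map[string][][]byte // PCollection ID -> sampled elements
}

// getSamples returns samples for the requested PCollection IDs, or all
// samples when no IDs are given. It is safe to call on a nil receiver,
// which models the "sampling disabled, sampler never initialized" case.
func (s *sampler) getSamples(ids []string) map[string][][]byte {
	out := map[string][][]byte{}
	if s == nil {
		// Sampling disabled: return an empty result instead of
		// dereferencing the nil receiver.
		return out
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(ids) == 0 {
		for id, elems := range s.samples {
			out[id] = elems
		}
		return out
	}
	for _, id := range ids {
		if elems, ok := s.samples[id]; ok {
			out[id] = elems
		}
	}
	return out
}

func main() {
	var disabled *sampler // never initialized, as when sampling is off
	fmt.Println(len(disabled.getSamples(nil)))

	enabled := &sampler{samples: map[string][][]byte{"pc1": {[]byte("a")}}}
	fmt.Println(len(enabled.getSamples([]string{"pc1", "missing"})))
}
```

In Go, calling a method on a nil pointer receiver is legal; the panic only occurs once the method touches the receiver's fields, which matches the `getAllSamples(0x0)` frame in the trace. The guard could equally live in the caller that handles the request rather than in the method itself.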

First: Fix the issue and cherry-pick the fix into 2.53.0.
Second: While this kind of failure is very unlikely, it would have been caught by a simple Dataflow Go Wordcount test run as a pre-commit. I'll add that.

Issue Priority

Priority: 1 (data loss / total loss of function)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
