
Data race in GetColumnPageReader #342

@daniel-adam-tfs


Describe the bug, including details regarding any error messages, version, and platform.

Is reading columns through a single rgReader from multiple goroutines supported?

We're reading four columns from a single row group concurrently. This worked fine in v18.1, but with v18.2 (since commit 6dc6926) the race detector reports a data race between the assignment and the read of r.rgPageIndexReader.

Trace from the race detector:

WARNING: DATA RACE

Read at 0x00c000af6040 by goroutine 9596:
  github.com/apache/arrow-go/v18/parquet/file.(*RowGroupReader).GetColumnPageReader()
      /go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/row_group_reader.go:131 +0x1c39
  github.com/apache/arrow-go/v18/parquet/file.(*RowGroupReader).Column()
      /go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/row_group_reader.go:70 +0x296
  github.com/thermofisher/cmd/research/parqstore/internal/parquetutil.ReadParquetColumnIndexBatch[go.shape.[]int32,go.shape.int32]()
      /go/src/parqstore/internal/parquetutil/parquetutil.go:205 +0x73
  github.com/thermofisher/cmd/research/parqstore/internal/chromatogram.(*centroidReader).readSimpleCentroidRowGroup.func1()
      /go/src/parqstore/internal/chromatogram/centroidreader.go:251 +0x104


Previous write at 0x00c000af6040 by goroutine 9597:
  github.com/apache/arrow-go/v18/parquet/file.(*RowGroupReader).GetColumnPageReader()
      /go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/row_group_reader.go:94 +0x1a4
  github.com/apache/arrow-go/v18/parquet/file.(*RowGroupReader).Column()
      /go/pkg/mod/github.com/apache/arrow-go/v18@v18.2.0/parquet/file/row_group_reader.go:70 +0x296
  github.com/thermofisher/cmd/research/parqstore/internal/parquetutil.ReadParquetColumnIndexBatch[go.shape.[]float64,go.shape.float64]()
      /go/src/parqstore/internal/parquetutil/parquetutil.go:205 +0x73
  github.com/thermofisher/cmd/research/parqstore/internal/chromatogram.(*centroidReader).readSimpleCentroidRowGroup.func2()
      /go/src/parqstore/internal/chromatogram/centroidreader.go:263 +0x106

Component(s)

Parquet

Labels

Type: bug
