[Go][Parquet] panic when writing particular dataset encoded by DeltaBitPacked #37102
Description
Describe the bug, including details regarding any error messages, version, and platform.
Version: v12.0.1
Here is the test case that causes the panic:
parquet/internal/encoding/encoding_test.go
func TestWriteDeltaBitPackedInt64(t *testing.T) {
	column := schema.NewColumn(schema.NewInt64Node("int64", parquet.Repetitions.Required, -1), 0, 0)
	tests := []struct {
		name     string
		toencode []int64
		expected []byte
	}{
		{"panic data", []int64{
			0, 3000000000000000000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
			0, 3000000000000000000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
			0, 3000000000000000000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
			0, 3000000000000000000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
			0, 0,
		}, []byte{0, 0, 0, 0, 0}}, // ignore expected bytes
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			enc := encoding.NewEncoder(parquet.Types.Int64, parquet.Encodings.DeltaBinaryPacked, false, column, memory.DefaultAllocator)
			enc.(encoding.Int64Encoder).Put(tt.toencode)
			buf, _ := enc.FlushValues() // <------------- panic on this line
			defer buf.Release()
			assert.Equal(t, tt.expected, buf.Bytes())
			dec := encoding.NewDecoder(parquet.Types.Int64, parquet.Encodings.DeltaBinaryPacked, column, memory.DefaultAllocator)
			dec.(encoding.Int64Decoder).SetData(len(tt.toencode), buf.Bytes())
			out := make([]int64, len(tt.toencode))
			dec.(encoding.Int64Decoder).Decode(out)
			assert.Equal(t, tt.toencode, out)
		})
	}
	// other subtests...
}
What I expect:
No panic, and the data is packed into a byte array.
What I actually get:
Running tool: /usr/local/go/bin/go test -timeout 30s -run ^TestWriteDeltaBitPackedInt64$ github.com/apache/arrow/go/v13/parquet/internal/encoding -v
=== RUN TestWriteDeltaBitPackedInt64
=== RUN TestWriteDeltaBitPackedInt64/panic_data
--- FAIL: TestWriteDeltaBitPackedInt64 (0.00s)
--- FAIL: TestWriteDeltaBitPackedInt64/panic_data (0.00s)
panic: runtime error: slice bounds out of range [:1026] with capacity 1024 [recovered]
panic: runtime error: slice bounds out of range [:1026] with capacity 1024
goroutine 19 [running]:
testing.tRunner.func1.2({0x103b6c000, 0x1400019c618})
/usr/local/go/src/testing/testing.go:1526 +0x1c8
testing.tRunner.func1()
/usr/local/go/src/testing/testing.go:1529 +0x364
panic({0x103b6c000, 0x1400019c618})
/usr/local/go/src/runtime/panic.go:884 +0x1f4
github.com/apache/arrow/go/v13/parquet/internal/encoding.(*deltaBitPackEncoder).FlushValues(0x140001eec00)
/Users/illyrix/Workspace/arrow/go/parquet/internal/encoding/delta_bit_packing.go:457 +0x33c
github.com/apache/arrow/go/v13/parquet/internal/encoding_test.TestWriteDeltaBitPackedInt64.func1(0x0?)
/Users/illyrix/Workspace/arrow/go/parquet/internal/encoding/encoding_test.go:642 +0xac
testing.tRunner(0x14000185ba0, 0x14000401680)
/usr/local/go/src/testing/testing.go:1576 +0x104
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:1629 +0x370
FAIL github.com/apache/arrow/go/v13/parquet/internal/encoding 0.976s
There is another issue about delta_bit_packing (#35718), but it may be a different bug: all values in our test case are non-null.
Component(s)
Go, Parquet