Skip to content

fix: list_columns.parquet testing#378

Merged
zeroshade merged 2 commits intoapache:mainfrom
MetalBlueberry:vperez/add-list-columns-parquet-testing
May 20, 2025
Merged

fix: list_columns.parquet testing#378
zeroshade merged 2 commits intoapache:mainfrom
MetalBlueberry:vperez/add-list-columns-parquet-testing

Conversation

@MetalBlueberry
Copy link
Copy Markdown
Contributor

@MetalBlueberry MetalBlueberry commented May 14, 2025

Rationale for this change

add list_columns.parquet testing. rr.Next() is returning true when we've already reached the end of the file. This leads to a panic later on

What changes are included in this PR?

  • Test that proves list_columns.parquet is not working as expected.
  • Use known path for partquet_testing if var is not defined. This ensures test can be run as usual but still enables custom locations for parquet_test_data. It also guarantees failure if not run correctly.

Are these changes tested?

That is the goal of the PR

Are there any user-facing changes?

no?

@MetalBlueberry MetalBlueberry force-pushed the vperez/add-list-columns-parquet-testing branch from f94cd32 to 68b7e59 Compare May 14, 2025 08:51
require.Equal(t, records[j][i], "")
continue
}
require.Equal(t, records[j][i], vals.ValueStr(j))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It panics here

@MetalBlueberry
Copy link
Copy Markdown
Contributor Author

I found a change that may be the root cause for this issue.
https://github.com/apache/arrow-go/pull/321/files#diff-d0e2a6bea8e483a05edf9c33af4b5cf404b4521349c529856751eca721f0593bR95

Calling ResetValues is not the same as Reset. Which seems to leave the recordReader in the wrong state

record reader needs to move forward after reading data to not attempt to read the same portion.
@MetalBlueberry MetalBlueberry marked this pull request as ready for review May 20, 2025 09:00
@MetalBlueberry MetalBlueberry requested a review from zeroshade as a code owner May 20, 2025 09:00
@MetalBlueberry MetalBlueberry changed the title fix: list columns parquet testing fix: list_columns.parquet testing May 20, 2025
@zeroshade zeroshade merged commit 60e36a4 into apache:main May 20, 2025
23 checks passed
zeroshade pushed a commit that referenced this pull request Jun 17, 2025
add list_columns.parquet testing. `rr.Next()` is returning true when
we've already reached the end of the file. This leads to a panic later
on

- Test that proves list_columns.parquet is not working as expected.
- Use known path for partquet_testing if var is not defined. This
ensures test can be run as usual but still enables custom locations for
parquet_test_data. It also guarantees failure if not run correctly.

That is the goal of the PR

no?

(cherry picked from commit 60e36a4)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants