@rouault (Contributor) commented May 8, 2024

These 2 files triggered the libparquet C++ issues apache/arrow#41317 and apache/arrow#41321. They were generated by a local run of oss-fuzz on synthetic test data from the GDAL regression test suite, and can be licensed under Apache-2.0.

@mapleFU mapleFU requested review from pitrou and wgtmac August 13, 2024 10:25
@mapleFU mapleFU changed the title Add 2 corrupted files Add corrupted files in bad_data Aug 13, 2024
@alamb (Contributor) left a comment

Makes sense to me. Thank you @rouault and @mapleFU

I think we may need a parquet reviewer to approve / merge this PR as well


* PARQUET-1481.parquet: tests a case where a schema Thrift value has been
corrupted
* arrow_issue_41321.parquet: test case of https://github.com/apache/arrow/issues/41321
Member

Maybe try to unify the file naming in this directory? We already have PARQUET-1481.parquet (a JIRA reference) so perhaps something like ARROW-GH-41321.parquet? (related: #57)

Member

Let me edit it 🤔

Member

done

@mapleFU (Member) commented Aug 14, 2024

I will merge this in 1 day if there are no negative comments. 7.1k is a bit large here, but not too large, since generating a file like this is also hard.

@mapleFU mapleFU requested a review from pitrou August 15, 2024 02:55
@mapleFU mapleFU merged commit 89ec47e into apache:master Aug 15, 2024
@mapleFU (Member) commented Aug 15, 2024

Merged! Thanks all!
