Skip to content

Conversation

@norberttech
Copy link
Member

@norberttech norberttech commented Oct 25, 2023

Change Log

Added

  • Possibility to iterate through all parquet file column chunk page headers

Fixed

Changed

  • PHPUnit code coverage thresholds

Removed

Deprecated

Security


Description

Ref: #575

Even though this is not mandatory for the reader itself, I'm going to use it later for parquet viewer CLI app, for now I'm using this to quickly understand parquet files.

This is how the viewer might look like in the future:
image

\fseek($stream, $pageOffset);
$thriftHeader = new \Flow\Parquet\Thrift\PageHeader();
$thriftHeader->read(new TCompactProtocol(new TBufferedTransport(new TPhpFileStream($stream))));
@$thriftHeader->read(new TCompactProtocol(new TBufferedTransport(new TPhpFileStream($stream))));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What error you have silenced here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal thrift warning about invalid array offset, unrelated for the logic, annoying in the CLI

@norberttech norberttech merged commit d67c235 into flow-php:1.x Oct 25, 2023
Comment on lines +59 to +60
\strlen($pageContainer->pageDataBuffer) + \strlen($pageContainer->pageHeaderBuffer),
\strlen($pageContainer->pageDataBuffer) + \strlen($pageContainer->pageHeaderBuffer),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is repeated, it can be extracted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a temporary hack. One is compressed, another uncompressed size, once I move to compressions this will be changed, but I'm leaving compressions for the end, now I'm going to tackle encodings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants