fix(parquet): Max target file size not working#16389
fix(parquet): Max target file size not working#16389PingLiuPing wants to merge 2 commits intofacebookincubator:mainfrom
Conversation
✅ Deploy Preview for meta-velox canceled.
|
|
is it related to #15751 fix? |
|
@prashantgolash Could you help take a look at this part?
However, dwrf reader is able to read "non-empty header only file" without error, is this behavior (empty file) intentional by design? |
I earlier thought about this but it worked for DWRF as you find and I realized this is corner case. Looks like for parquet we may need to fix this. Thanks for making the change. I will look into the diff in more detail by tomorrow EOD |
There was a problem hiding this comment.
For case when ensureFiles is true, we will unconditionally create Writer. I think writer close should write footer file but that is not explicit from Writer contract. What will be behavior with parquet writer? Do we need to check for this also in finalizeWriter path to make sure this is explicitly handled.
The property intends to have files. Per comment
// When this option is set the HiveDataSink will always write a file even
// if there's no data. This is useful when the table is bucketed, but the
// engine handles ensuring a 1 to 1 mapping from task to bucket.
There was a problem hiding this comment.
Also I think we need to rebase this on your other diff?
There was a problem hiding this comment.
Thanks @prashantgolash
What will be behavior with parquet writer?
For parquet, it will generate an empty file (0 bytes).
Do we need to check for this also in finalizeWriter path to make sure this is explicitly handled.
I think I understand your point here, I'm trying to figure out if this is valid requirement for parquet. I have a fix for this to make it align with dwrf behaviour, basically I'm passing ensureFiles to writer option and during parquet writer close(), check this option together with other states and then write the metadata to parquet file.
How about land this PR first and I'll submit another PR to fix this if necessary.
| @@ -1049,28 +1032,10 @@ uint32_t HiveDataSink::appendWriter(const HiveWriterId& id) { | |||
| std::move(writerPool), | |||
| std::move(sinkPool), | |||
| std::move(sortPool))); | |||
| // Set current file names for the initial file (sequence 0). | |||
| auto& newInfo = writerInfo_.back(); | |||
| newInfo->currentWriteFileName = newInfo->writerParameters.writeFileName(); | |||
There was a problem hiding this comment.
Does this change the file name? We don't need sequence number in this case
There was a problem hiding this comment.
@Yuhta Thanks for the comment.
When sequence number is 0, makeSequencedFileName returns original file name.
velox/velox/connectors/hive/HiveDataSink.cpp
Lines 48 to 53 in fe064ad
95addf9 to
96b0150
Compare
96b0150 to
fc6e786
Compare
# Conflicts: # velox/connectors/hive/HiveDataSink.cpp
fc6e786 to
fc7e533
Compare
|
@Yuhta Could you help to take another look at your convenience? Thank you. |
Yuhta
left a comment
There was a problem hiding this comment.
Intentional Behavioral Changes
-
No empty trailing files after rotation: After
rotateWriter(), the writer isnullptruntil the nextwrite()call. If no more data arrives for that partition, no file is created. Previously, the next writer/file was created eagerly inrotateWriter(), potentially producing an empty file. -
Empty files excluded from
writtenFilesmetadata:finalizeWriterFile()now skips recordingHiveFileInfowhencurrentFileBytes == 0. Previously, a zero-byte file entry would be included in the commit metadata.
No Unintended Behavioral Changes
| Concern | Analysis | Verdict |
|---|---|---|
| File naming for initial file | makeSequencedFileName(name, 0) returns name unchanged — identical to the old code which used writerParameters.writeFileName() directly |
Safe |
createWriterOptions() vs createWriterOptions(index) |
The no-arg overload is just return createWriterOptions(writerInfo_.size() - 1). In appendWriter(), the writer was just pushed, so size() - 1 equals the new writer's index. Both paths resolve to the same indexed overload with the same index. |
Safe |
| Non-reclaimable guard on lazy writer creation | Old rotateWriter() created the new writer without a non-reclaimable guard. New path creates it inside write() which already holds WRITER_NON_RECLAIMABLE_SECTION_GUARD(index), plus createWriterForIndex() adds its own. The macro creates a local RAII variable (nonReclaimableGuard) — the nested one in createWriterForIndex shadows the outer one but both reference the same nonReclaimableSectionHolder, so this is harmless. This is actually more protective than the old code. |
Safe (improved) |
| Writer creation timing / memory | After rotation, memory allocation for the new writer is deferred until the next write(). This changes when memory is consumed but not the total amount. |
Minor timing shift, no correctness issue |
Summary
The only behavioral changes are the two intentional ones: no empty trailing files after rotation, and no zero-byte entries in commit metadata. The refactoring (createWriterForIndex, maybeCreateBucketSortWriter taking an index) is behavior-preserving.
Previously, writer rotation did not behave as expected when max-target-file-size was set to a value smaller than the default flush policy threshold.
Parquet only updates write stats on row group flush, so small target sizes were ignored.
For example when max-target-file-size < kDefaultBytesInRowGroup, we could not generating files close to the expected size.
We now cap Parquet’s row group byte threshold so stats advance during writes and rotation can happen.
And fixes a side effect where during writer rotation it could leave an empty file (for parquet it is an empty file with 0B size, for dwrf, it is a non-empty header only file).
The root cause is that prior to introducing writer rotation, writers were created strictly in response to actual input. After enabling rotation, closing a full writer would immediately create a new writer, regardless of whether more input was available. If no additional data arrived, it creates an empty file.
This fix makes the rotation lazy: after closing a full file it only advance the file sequence number but wait to open the next writer until new input arrives, which avoids creating an empty file.