Skip to content

ARROW-3256,3304: [JS] fix file footer inconsistency, yield all messages from the stream reader#2616

Closed
trxcllnt wants to merge 3 commits intoapache:masterfrom
trxcllnt:js-stream-reader-fixes-rebased
Closed

ARROW-3256,3304: [JS] fix file footer inconsistency, yield all messages from the stream reader#2616
trxcllnt wants to merge 3 commits intoapache:masterfrom
trxcllnt:js-stream-reader-fixes-rebased

Conversation

@trxcllnt
Copy link
Contributor

Yield all messages from the stream reader to support reading multiple tables from the same source stream. This is something a stream-reader refactor should support out of the box, but this PR adds this ability to master until that refactor is done. I added an integration test that shows what such a reader would look like today.

@trxcllnt trxcllnt changed the title ARROW-3304: [JS]: yield all messages emit all messages from the stream reader ARROW-3304: [JS] yield all messages emit all messages from the stream reader Sep 24, 2018
@trxcllnt trxcllnt changed the title ARROW-3304: [JS] yield all messages emit all messages from the stream reader ARROW-3304: [JS] yield all messages from the stream reader Sep 24, 2018
Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metadata changes look good to me

constructor(version: MetadataVersion, length: Long | number, nodes: FieldMetadata[], buffers: BufferMetadata[], bodyLength?: Long | number) {
if (bodyLength === void(0)) {
bodyLength = buffers.reduce((s, b) => align(s + b.length + (b.offset - s), 8), 0);
bodyLength = buffers.reduce((bodyLength, buffer) => bodyLength + buffer.length, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are your buffer lengths here guaranteed to a multiple of 8? Otherwise should round up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this size is rounded up in the writer here

@wesm
Copy link
Member

wesm commented Sep 26, 2018

Cool, @TheNeuralBit does this look good?

@TheNeuralBit TheNeuralBit changed the title ARROW-3304: [JS] yield all messages from the stream reader ARROW-3256,3304: [JS] fix file footer inconsistency, yield all messages from the stream reader Sep 26, 2018
@TheNeuralBit
Copy link
Member

👍 LGTM

@wesm wesm closed this in c1cf985 Sep 27, 2018
trxcllnt added a commit that referenced this pull request Sep 27, 2018
Fixes https://issues.apache.org/jira/browse/ARROW-3336. Realized too late I branched from the #2616, so that's why there's some extra commits at the front.

Check out the relevant tests here: https://github.com/trxcllnt/arrow/blob/860f61046fca0081dbe0ce986a97408ed5934e22/js/test/unit/writer-tests.ts#L26. This test covers the primitive, nested, and dictionary vectors, but it'd be worth adding more (for example, Unions).

Author: ptaylor <paul.e.taylor@me.com>

Closes #2638 from trxcllnt/js-fix-writing-sliced-arrays and squashes the following commits:

bcd58ca <ptaylor> fix writer-tests.ts import
fa4d8f5 <ptaylor> only use vector offset to serialize bitmaps and data buffers of variable-width vectors, as all other buffers have already been adjusted as part of the original call to Vector#slice()
5ece080 <ptaylor> reset internal children Array in case the childData for each has been sliced
e093f7c <ptaylor> add test validating writer serializes sliced arrays correctly
f9dd91b <ptaylor> move vector comparison extension into separate module
380d655 <ptaylor> fix table constructor args
kou pushed a commit to apache/arrow-js that referenced this pull request May 14, 2025
Fixes https://issues.apache.org/jira/browse/ARROW-3336. Realized too late I branched from the apache/arrow#2616, so that's why there's some extra commits at the front.

Check out the relevant tests here: https://github.com/trxcllnt/arrow/blob/860f61046fca0081dbe0ce986a97408ed5934e22/js/test/unit/writer-tests.ts#L26. This test covers the primitive, nested, and dictionary vectors, but it'd be worth adding more (for example, Unions).

Author: ptaylor <paul.e.taylor@me.com>

Closes #2638 from trxcllnt/js-fix-writing-sliced-arrays and squashes the following commits:

bcd58caa <ptaylor> fix writer-tests.ts import
fa4d8f54 <ptaylor> only use vector offset to serialize bitmaps and data buffers of variable-width vectors, as all other buffers have already been adjusted as part of the original call to Vector#slice()
5ece0806 <ptaylor> reset internal children Array in case the childData for each has been sliced
e093f7c0 <ptaylor> add test validating writer serializes sliced arrays correctly
f9dd91b0 <ptaylor> move vector comparison extension into separate module
380d6555 <ptaylor> fix table constructor args
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants