Skip to content

ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder#12451

Closed
Alfred-Mountfield wants to merge 6 commits intoapache:masterfrom
Alfred-Mountfield:js-struct-initialisation
Closed

ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder#12451
Alfred-Mountfield wants to merge 6 commits intoapache:masterfrom
Alfred-Mountfield:js-struct-initialisation

Conversation

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor

When trying to add a null value to a StructBuilder, the code previously only modified the null-bitmap.
I believe this breaks the spec as it results in the child arrays being a different length to the Struct array (and the child arrays should have the null value appended to their values).

@github-actions
Copy link
Copy Markdown

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@Alfred-Mountfield Alfred-Mountfield changed the title [ARROW-15705] Allowing appending null on children in a StructBuilder ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder Feb 17, 2022
@github-actions
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@emkornfield
Copy link
Copy Markdown
Contributor

CC @domoritz

@domoritz domoritz requested a review from trxcllnt February 18, 2022 17:30
Copy link
Copy Markdown
Contributor

@trxcllnt trxcllnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a small suggestion for how to do this cleaner.

The spec is ambiguous here, but I agree the spec should be updated to enforce this. We'll likely need to update the Map and SparseUnion builders too.

@trxcllnt
Copy link
Copy Markdown
Contributor

@Alfred-Mountfield looks good to me. Thanks!

@trxcllnt
Copy link
Copy Markdown
Contributor

@Alfred-Mountfield I think we have the same issue in the UnionBuilder (but only when the type is SparseUnion). We can either add an override to that builder in this PR, or we can do it in a follow up PR.

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

add an override to that builder in this PR

@trxcllnt If it's the same fix then I'm happy to add the changes at some point tomorrow, otherwise we could just do it in the follow-up as you say

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

With regards to changing it in the UnionBuilder I think it'd be best to do it in another PR to avoid polluting this one and the associated Jira issue. Also, I had code locally that exposed this problem so I was able to test these changes fixed those cases. I don't have the same for the UnionBuilder.

It's probably worth making another Jira issue for that and maybe removing the ambiguity in the specification as well.

Copy link
Copy Markdown
Member

@domoritz domoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the pull request. Since you have a local test case, please add a unit test so we can ensure not to repeat this issue.

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

Since you have a local test case, please add a unit test so we can ensure not to repeat this issue.

@domoritz
The local code that exposes the issue is a system that uses both the Rust Arrow implementation and the JavaScript implementation. I actually am not 100% sure how to write a unit-test for this within this repository.

The problem goes a bit deeper, as I've raised in a different Jira ticket, and I'm afraid I'm nowhere near familiar enough with the implementation within this repository to be able to contribute effectively at this point. I have found it difficult to understand the correct way of doing things with this package, as the docs are less beginner-friendly than the other language implementations, and the lack of examples exacerbates this.

I'm open to trying to contribute further, but I do not know how much time I can allocate to learning the specifics of this implementation (hence me missing the purposes of the setValid method). I did have a look around the testing for the builders and such but there seems to be a rather complex technique to generating test cases which would take a while to understand how to expand it to cover these quite specific edge-cases.

@domoritz
Copy link
Copy Markdown
Member

I totally understand. Thank you for sending the pull request and explaining your background.

I can help with adding the test code if you can send a code snippet and describe the expectation before and after the fix. Would that work?

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

That'd be great, thank you :)

I believe the code snippet in that last Jira I sent should probably be about right.

I believe the problem is that when a struct vector is made (either through the Builder or through the makeVector methods), the underlying metadata is incorrect when you have a null struct within that vector. In the simplest case, a vector of a single null struct, with one field, e.g.

let vec = makeVector({
    data: [null],
    type: struct_field.type,
    nullable: true,
});

or

let builder = new Builder({
    type: struct_field.type,
    nullValues: [null, undefined],
});
builder.append(null);
let vec = builder.toVector();

Then the children, nullCount, and lengths are not what I'd expect them to be. I believe (although I'm still not 100% sure I'm reading the spec correctly), that all three of those should be equal to 1.

Where the underlying memory layout will have a null-bitmap buffer of 00000001, and one child array (which in also has a null value, as shown in the specification)

As the code-snippet (and StackBlitz) shows, all three of those methods seem to have differing values and I'm not really sure why.

It's possible that those three values are not adequate for testing that it is correct though, we might need to check that the underlying buffers are initialised for more complex field types on the struct, etc.

@domoritz
Copy link
Copy Markdown
Member

I'm having a hard time finding the time right now. If someone else can write the test case, that would be great. Otherwise I'll do it eventually.

@domoritz
Copy link
Copy Markdown
Member

domoritz commented Mar 2, 2022

I agree that the behavior right now is odd. My replication is at https://stackblitz.com/edit/node-sep8dg?file=index.js. I seem to be getting the same results in this branch, though.

$ node --experimental-vm-modules ./test.js
Builder
[ null ]
1
1
1
makeVector
[]
0
1
0

Can you show me a code snippet that produces different results @Alfred-Mountfield?

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

Can you show me a code snippet that produces different results @Alfred-Mountfield?

I apologise, I seem to have conflated two issues and accidentally implied that this fixes both. That's what I meant by:

The problem goes a bit deeper, as I've raised in a different Jira ticket

The fix I've pushed in this branch stops an error when reading the Array back in Rust. I believe there's an assert within the rust-rs code that checks the children's lengths are equal to the Struct node itself. I hadn't checked to see if it actually changed the values of the latter snippet, but I'm sad to hear it doesn't :( Seems like there's another underlying issue.

I happened to notice the discrepancies with the Builder and makeVector methods when trying to debug and check if we were building things incorrectly in the JavaScript side of our app.

@domoritz
Copy link
Copy Markdown
Member

domoritz commented Mar 2, 2022

Okay. We can't use the rust code in the unit tests so we need an example that changes behaves differently in this branch. Can you write one for this pull request so we can make it into a unit test?

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

Apologies for the late response. Thank you for pushing for more concrete proof that this PR achieves its intention.

While trying to create a better test-case for you I realise we have a large quantity of errors within our uses of the library that have only been exposed to me while trying to create a test-case within the arrow project itself in an IDE with proper intellisense.

I think the code examples I have given (i.e. the StackBlitz links) aren't valid due to errors such as these:
image
image
image

Unfortunately, using the JS Arrow packages don't really give any indication that doing things like that are incorrect. Not trying to restate the obvious benefits of using TypeScript, but I do worry a bit that the arrow package lends itself to so many errors that fail somewhat silently in plain JS because of things like abstract classes. I think having better documentation showing more use-cases and trying to convey convention would help a lot with this. Is there somewhere I can raise this properly (as this PR doesn't feel like the right place to start that discussion), should I just open a Jira ticket or will that likely languish?

Going back to this PR, I'm going to investigate the methods a bit more deeply, and try and understand the library's implementation a bit better when it comes to things like setValid etc. I worry that the spec is a little too vague when it comes to the specifics of the lengths of buffers in things like fixedSizeLists and Structs with null elements, and it's possible the different language impls have conflicting interpretations. I'll do my investigation in TS (rather than JS as I was before) and post again afterwards :)

@domoritz
Copy link
Copy Markdown
Member

domoritz commented Apr 6, 2022

I think having better documentation showing more use-cases and trying to convey convention would help a lot with this. Is there somewhere I can raise this properly (as this PR doesn't feel like the right place to start that discussion), should I just open a Jira ticket or will that likely languish?

Jira is a good place. Maybe https://issues.apache.org/jira/browse/ARROW-16039 already covers what you describe. And yes, we need more examples and would love help on that.

@Alfred-Mountfield Can you make some time to wrap up this pull request so we can merge it before the next release?

@Alfred-Mountfield Alfred-Mountfield force-pushed the js-struct-initialisation branch from 0a90ba2 to 5f7e2ac Compare April 6, 2022 18:03
@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

Apologies @domoritz on such a late reply, this dropped off my radar as I completely moved away from Arrow related stuff for a while.

I've tried recreating it with the following:

describe ('StructBuilder', () => {
    test('Appending Null', () => {
        let list_field = new Field('n1', new FixedSizeList(2, new Field('item', new Float64())), true);
        let struct_field = new Field('foo', new Struct([list_field]), true);

        let builder = makeBuilder({
            type: struct_field.type,
            nullValues: [null, undefined],
        });

        builder.append(null);
        console.log('Struct numChildren: ' + builder.numChildren);
        console.log('Struct nullCount: ' + builder.nullCount);
        console.log('Struct length: ' + builder.length);
        console.log('Child numChildren: ' + builder.children[0].numChildren);
        console.log('Child nullCount: ' + builder.children[0].nullCount);
        console.log('Child length: ' + builder.children[0].length);
    }
}

When ran with the change made in this PR:

    Struct numChildren: 1
    Struct nullCount: 1
    Struct length: 1
    Child numChildren: 1
    Child nullCount: 1
    Child length: 1

and when ran without the change:

    Struct numChildren: 1
    Struct nullCount: 1
    Struct length: 1
    Child numChildren: 1
    Child nullCount: 0
    Child length: 0

As you can see, the null-count and length of the child is 0, whereas the spec seems to suggest the Children arrays should be the same length as the Struct array.

I tried extending my testing with makeVector but I wasn't quite able to figure out the APIs in terms of how I'd represent the null data. I'm assuming using the export function makeData<T extends Null>(props: NullDataProps<T>): Data<T>; interface but I don't quite have capacity to investigate.

As we discussed above, if these changes need to be reflected in the UnionBuilder or others, it'd be best in another PR.

Hopefully the simple test I gave above helps with testing those, specifically we're interested in checking that children of the builders correctly have their setValid methods called and thus have the correct nullCount and length.

@domoritz
Copy link
Copy Markdown
Member

I added a test in #12859. Feel free to merge it into this pull request so we can merge this instead of mine.

@Alfred-Mountfield
Copy link
Copy Markdown
Contributor Author

Apologies for yet another late reply, have been away from my computer for a week on holiday. I've merged your branch in @domoritz.

@domoritz domoritz closed this in fae66cb Apr 18, 2022
@ursabot
Copy link
Copy Markdown

ursabot commented Apr 18, 2022

Benchmark runs are scheduled for baseline = e0668e2 and contender = fae66cb. fae66cb is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.59% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.36% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.38% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/525| fae66cba ec2-t3-xlarge-us-east-2>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/512| fae66cba test-mac-arm>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/511| fae66cba ursa-i9-9960x>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/522| fae66cba ursa-thinkcentre-m75q>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/524| e0668e20 ec2-t3-xlarge-us-east-2>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/511| e0668e20 test-mac-arm>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/510| e0668e20 ursa-i9-9960x>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/521| e0668e20 ursa-thinkcentre-m75q>
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants