Skip to content

Conversation

@BryanCutler
Copy link
Member

The last upgrade of the Jackson JSON library changed behavior to no longer allow reading of "NaN" values by default. This change configures the JSON generator and parser to allow for NaN values (unquoted) alongside standard floating point numbers. A test was added for JSON writing/reading and modified the test for Arrow file and stream .

@icexelloss
Copy link
Contributor

LGTM. Do you want to add "NaN" in integration/data/simple.json?

@BryanCutler
Copy link
Member Author

The background for this is that an existing Spark unit test for Arrow fails with the current master branch due to no longer being able to parse these values. The configuration fixes this and allows for the following JSON:

    {
      "name" : "float",
      "count" : 10,
      "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
      "DATA" : [NaN,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
    }

This is not standard JSON, but is supported by the Jackson library, I don't think it is something to be concerned about though. cc @icexelloss

@BryanCutler
Copy link
Member Author

LGTM. Do you want to add "NaN" in integration/data/simple.json?

Thanks @icexelloss for the quick review! Yeah, I'll try adding this to the integration tests and see if it's handled

@BryanCutler
Copy link
Member Author

It doesn't look like C++ is able to parse the NaN value, but I'm not sure if this is something that we need to be running integration for. @wesm what do you think?

@wesm
Copy link
Member

wesm commented Nov 29, 2017

Hm, I can try to fix the C++ reader. Up to you

@BryanCutler
Copy link
Member Author

Given that the issue was only on the Java side and it's not standard JSON, I would say not to bother with the C++ reader and integration tests for now.

@icexelloss
Copy link
Contributor

LGTM. +1. With or without NaN in integration test.

@wesm wesm force-pushed the java-JsonReader-all_non_numeric-ARROW-1817 branch from 0b7295b to 4c4682a Compare December 1, 2017 16:52
@BryanCutler
Copy link
Member Author

Thanks @icexelloss , I'll try merging this later today if no more comments

Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@wesm wesm closed this in ad9105e Dec 1, 2017
@BryanCutler BryanCutler deleted the java-JsonReader-all_non_numeric-ARROW-1817 branch November 19, 2018 05:49
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…lues

The last upgrade of the Jackson JSON library changed behavior to no longer allow reading of "NaN" values  by default.  This change configures the JSON generator and parser to allow for NaN values (unquoted) alongside standard floating point numbers.  A test was added for JSON writing/reading and modified the test for Arrow file and stream .

Author: Bryan Cutler <cutlerb@gmail.com>

Closes apache#1375 from BryanCutler/java-JsonReader-all_non_numeric-ARROW-1817 and squashes the following commits:

4c4682a [Bryan Cutler] configure JsonWriter to write NaN not as strings, add test for read and write of float with NaN
1fa24f4 [Bryan Cutler] added conf for JacksonParser to allow NaN tokens
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants