Skip to content

Conversation

@liyafan82
Copy link
Contributor

For variable-width vectors (VarCharVector and VarBinaryVector), when the validity bit is not set, it means the underlying data is null, so the get method should return null.

However, the current implementation throws an IllegalStateException when NULL_CHECKING_ENABLED is set, or returns an empty array when the flag is clear.

Maybe the purpose of this design is to be consistent with fixed-width vectors. However, the scenario is different: fixed-width vectors (e.g. IntVector) throw an IllegalStateException, simply because the primitive types are non-nullable.

@wesm wesm changed the title [ARROW-5973][Java] Variable width vectors' get methods should return null when the underlying data is null ARROW-5973: [Java] Variable width vectors' get methods should return null when the underlying data is null Jul 18, 2019
@codecov-io
Copy link

codecov-io commented Jul 18, 2019

Codecov Report

Merging #4901 into master will increase coverage by 2.27%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4901      +/-   ##
==========================================
+ Coverage   87.31%   89.58%   +2.27%     
==========================================
  Files         894      661     -233     
  Lines      134066    96617   -37449     
==========================================
- Hits       117055    86553   -30502     
+ Misses      16659    10064    -6595     
+ Partials      352        0     -352
Impacted Files Coverage Δ
cpp/src/plasma/thirdparty/ae/ae.c 70.75% <0%> (-0.95%) ⬇️
r/src/recordbatch.cpp
r/R/Table.R
go/arrow/math/uint64_amd64.go
r/src/symbols.cpp
rust/datafusion/src/execution/filter.rs
rust/arrow/src/csv/writer.rs
rust/datafusion/src/bin/main.rs
go/arrow/ipc/file_reader.go
r/src/arrow_types.h
... and 224 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1abf18f...8fe83f7. Read the comment docs.

@emkornfield
Copy link
Contributor

This seems like the right functionality. It looks like it impacts FixedSizeBinary as well (https://github.com/apache/arrow/blob/a222c7d4ddfa43ee35a6ecf1c1563b48ecd6b860/java/vector/src/main/java/org/apache/arrow/vector/FixedSizeBinaryVector.java)

@praveenbingo @pravindra any concerns with the change in functionality?

@liyafan82
Copy link
Contributor Author

This seems like the right functionality. It looks like it impacts FixedSizeBinary as well (https://github.com/apache/arrow/blob/a222c7d4ddfa43ee35a6ecf1c1563b48ecd6b860/java/vector/src/main/java/org/apache/arrow/vector/FixedSizeBinaryVector.java)

@praveenbingo @pravindra any concerns with the change in functionality?

@emkornfield Revised accordingly. Thanks a lot for your kind reminder.

Copy link
Contributor

@pravindra pravindra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@emkornfield
Copy link
Contributor

+1, we should start a list someplace so we don't forget for the next release behavior changes like this. So far I think we have:

  1. changing the lastSet value on some of the list vectors?
  2. This change?
  3. Am I forgetting something?

@liyafan82
Copy link
Contributor Author

+1, we should start a list someplace so we don't forget for the next release behavior changes like this. So far I think we have:

  1. changing the lastSet value on some of the list vectors?
  2. This change?
  3. Am I forgetting something?

Good suggestion. Let's start a list in the mailing list.

kszucs pushed a commit that referenced this pull request Jul 22, 2019
…null when the underlying data is null

For variable-width vectors (VarCharVector and VarBinaryVector), when the validity bit is not set, it means the underlying data is null, so the get method should return null.

However, the current implementation throws an IllegalStateException when NULL_CHECKING_ENABLED is set, or returns an empty array when the flag is clear.

Maybe the purpose of this design is to be consistent with fixed-width vectors. However, the scenario is different: fixed-width vectors (e.g. IntVector) throw an IllegalStateException, simply because the primitive types are non-nullable.

Author: liyafan82 <fan_li_ya@foxmail.com>

Closes #4901 from liyafan82/fly_0717_varget and squashes the following commits:

8fe83f7 <liyafan82>  Variable width vectors' get methods should return null when the underlying data is null
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…null when the underlying data is null

For variable-width vectors (VarCharVector and VarBinaryVector), when the validity bit is not set, it means the underlying data is null, so the get method should return null.

However, the current implementation throws an IllegalStateException when NULL_CHECKING_ENABLED is set, or returns an empty array when the flag is clear.

Maybe the purpose of this design is to be consistent with fixed-width vectors. However, the scenario is different: fixed-width vectors (e.g. IntVector) throw an IllegalStateException, simply because the primitive types are non-nullable.

Author: liyafan82 <fan_li_ya@foxmail.com>

Closes apache#4901 from liyafan82/fly_0717_varget and squashes the following commits:

8fe83f7 <liyafan82>  Variable width vectors' get methods should return null when the underlying data is null
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants