Skip to content

Conversation

@pitrou
Copy link
Member

@pitrou pitrou commented Jul 30, 2019

No description provided.

@pitrou
Copy link
Member Author

pitrou commented Jul 30, 2019

@emkornfield

@pitrou pitrou force-pushed the ARROW-4810-large-list branch 2 times, most recently from 87bb90a to ca91e2f Compare July 30, 2019 16:57
@emkornfield emkornfield self-requested a review July 30, 2019 18:22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to check someplace that the offset buffer has the expected size?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's done in ValidateArray (well, it checks it has at least the required size)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because testing a list type with int32 values while offsets are also int32 is not gonna catch some possible sources of errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ListArrayType?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That implies it's an Array type. I can call it TypeClass if you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is OK for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change seems unrelated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I added type_name() to all instantiable type classes. The motivation is to allow producing nice error messages even when you don't have a DataType instance, only class (for example in a concrete type-templated function).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this limit people that want to extend the code base in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, why would it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if name() actually had to contain non-trivial logic? I wonder why it had this signature to begin with.

Copy link
Member Author

@pitrou pitrou Jul 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, it seems the non-trivial logic ends up in ToString() by design.
As for the rationale behind name(), perhaps @wesm or @xhochy know about the history?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the name() function is important. It's been around since the beginning of the project but I don't see it used much

@codecov-io
Copy link

codecov-io commented Jul 30, 2019

Codecov Report

Merging #4969 into master will decrease coverage by 0.41%.
The diff coverage is 95.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4969      +/-   ##
==========================================
- Coverage   87.98%   87.56%   -0.42%     
==========================================
  Files         910     1000      +90     
  Lines      133521   142570    +9049     
  Branches     1418     1418              
==========================================
+ Hits       117483   124848    +7365     
- Misses      16028    17360    +1332     
- Partials       10      362     +352
Impacted Files Coverage Δ
cpp/src/arrow/extension_type.cc 95.45% <ø> (+2.12%) ⬆️
cpp/src/arrow/array/builder_nested.cc 90.55% <ø> (-2.12%) ⬇️
cpp/src/arrow/visitor.h 50% <ø> (ø) ⬆️
cpp/src/arrow/pretty_print.cc 83.33% <ø> (ø) ⬆️
cpp/src/arrow/visitor_inline.h 88.78% <ø> (ø) ⬆️
cpp/src/arrow/type_traits.h 86.2% <ø> (ø) ⬆️
cpp/src/parquet/arrow/writer.cc 96.53% <0%> (-0.19%) ⬇️
cpp/src/arrow/scalar.cc 63.33% <0%> (ø) ⬆️
cpp/src/arrow/scalar.h 87.5% <0%> (ø) ⬆️
cpp/src/arrow/extension_type.h 85.71% <0%> (-14.29%) ⬇️
... and 122 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cac4957...a1d93b2. Read the comment docs.

@emkornfield
Copy link
Contributor

Note to self, old proof of concept for java 03956ca

Copy link
Contributor

@fsaintjacques fsaintjacques left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bravo.

@pitrou pitrou force-pushed the ARROW-4810-large-list branch from ca91e2f to a1d93b2 Compare July 31, 2019 11:40
@pitrou
Copy link
Member Author

pitrou commented Jul 31, 2019

@emkornfield
Copy link
Contributor

+1, thank you

@pitrou pitrou deleted the ARROW-4810-large-list branch August 1, 2019 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants