Skip to content

Conversation

@julienledem
Copy link
Member

@julienledem julienledem commented Apr 19, 2017

The current java vector support the timezone less version of time related types but in an incomplete way.
This change fixes it and clarifies what the vectors implementation do.
We'll need separate vectors or adapt those to deal will timezone aware time types.

@wesm
Copy link
Member

wesm commented Apr 19, 2017

I suppose failing the integration tests is good?

With output:
--------------
<SNIP>
22:26:05.180 [main] DEBUG o.a.arrow.vector.file.ReadChannel - Reading buffer with size: 10
22:26:05.184 [main] DEBUG o.a.a.vector.file.ArrowFileReader - Footer starts at 5992, length: 1064
22:26:05.184 [main] DEBUG o.a.arrow.vector.file.ReadChannel - Reading buffer with size: 1064
Incompatible files
only timezone-less timestamps are supported for now: Timestamp(MILLISECOND, America/New_York)
22:26:05.281 [main] ERROR org.apache.arrow.tools.Integration - Incompatible files
java.lang.IllegalArgumentException: only timezone-less timestamps are supported for now: Timestamp(MILLISECOND, America/New_York)
	at org.apache.arrow.vector.types.Types$1.visit(Types.java:584) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.types.Types$1.visit(Types.java:489) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.types.pojo.ArrowType$Timestamp.accept(ArrowType.java:929) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.types.Types.getMinorTypeForArrowType(Types.java:489) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.types.pojo.FieldType.createNewSingleVector(FieldType.java:56) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.types.pojo.Field.createVector(Field.java:89) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.file.ArrowReader.initialize(ArrowReader.java:160) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.file.ArrowReader.ensureInitialized(ArrowReader.java:143) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.vector.file.ArrowReader.getVectorSchemaRoot(ArrowReader.java:68) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.tools.Integration$Command$3.execute(Integration.java:171) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.tools.Integration.run(Integration.java:101) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]
	at org.apache.arrow.tools.Integration.main(Integration.java:62) ~[arrow-tools-0.2.1-SNAPSHOT-jar-with-dependencies.jar:na]

@wesm
Copy link
Member

wesm commented Apr 21, 2017

@julienledem could you take a look at the integration test failure?

@julienledem
Copy link
Member Author

@wesm Yes, failing the test is good. FYI: I removed this JIRA from the 0.3 release. I'll update the integration test.

@wesm
Copy link
Member

wesm commented Apr 24, 2017

Cool

@jacques-n
Copy link
Contributor

Did you evaluate using the java8 apis instead of the joda ones? There is this backport for java 7: http://www.threeten.org/threetenbp/.

@julienledem
Copy link
Member Author

@jacques-n Since those classes use a different package name, it does not look like there's a benefit to have them as an intermediate step. I'd rather move to the actual java 8 classes once arrow is on java 8. Moving to LocalDateTime is already a pretty big change without adding this on top.

@jacques-n
Copy link
Contributor

Got it, makes sense, thanks for the explanation @julienledem.

Change-Id: I752bb6760c9ae3474e5af73593cd334872540d75
Change-Id: Ia375f6cf2164ee0c65582fc1b94483572192d565
Change-Id: I3265d9ff676e7090d2d77b143c36a83fbf3f8e81
Change-Id: I45026ec7f9a6afab6d6a63f18048872db5ac2e24
@julienledem
Copy link
Member Author

This is good to go IMO

@wesm
Copy link
Member

wesm commented May 3, 2017

Removing the time zone feels like a regression. How difficult is it to preserve the time zone as metadata only?

Change-Id: I1ae9518d401056143208206695a2c54fb94df3e1
@julienledem
Copy link
Member Author

@wesm The current vector implementation of the TimestampVector is TimezoneLess (except it was buggy because it used DateTime instead of LocalDateTime).
It worked in the integration tests because we never went through the Accessor.getObject/Reader/Writer java apis essentially just looking at the numeric timestamp value and ignoring the timezone.
To have a correct implementation we need a separate TZTimestampVector or a major refactor to decouple this logic from the vector.

@wesm
Copy link
Member

wesm commented May 6, 2017

Got it. I am ok with this, then. Let's open a JIRA to make a plan and implement something that works in time for 0.4 (end of this month)?

@julienledem
Copy link
Member Author

@wesm OK: I have started the new vectors and some template cleanup here: 3e3ac84

Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@asfgit asfgit closed this in 316c63d May 6, 2017
jeffknupp pushed a commit to jeffknupp/arrow that referenced this pull request Jun 3, 2017
The current java vector support the timezone less version of time related types but in an incomplete way.
This change fixes it and clarifies what the vectors implementation do.
We'll need separate vectors or adapt those to deal will timezone aware time types.

Author: Julien Le Dem <julien@apache.org>

Closes apache#568 from julienledem/ARROW-824 and squashes the following commits:

3528ad8 [Julien Le Dem] add license
e37385c [Julien Le Dem] centralize LocalDateTime.toMillis
bdac7ff [Julien Le Dem] make integration tests output more readable
b0da88c [Julien Le Dem] fix failing integration test
61518ec [Julien Le Dem] improve travis integration
ec19e7d [Julien Le Dem] ARROW-824: Date and Time Vectors should reflect timezone-less semantics
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
The current java vector support the timezone less version of time related types but in an incomplete way.
This change fixes it and clarifies what the vectors implementation do.
We'll need separate vectors or adapt those to deal will timezone aware time types.

Author: Julien Le Dem <julien@apache.org>

Closes apache#568 from julienledem/ARROW-824 and squashes the following commits:

3528ad8 [Julien Le Dem] add license
e37385c [Julien Le Dem] centralize LocalDateTime.toMillis
bdac7ff [Julien Le Dem] make integration tests output more readable
b0da88c [Julien Le Dem] fix failing integration test
61518ec [Julien Le Dem] improve travis integration
ec19e7d [Julien Le Dem] ARROW-824: Date and Time Vectors should reflect timezone-less semantics
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants