Skip to content

GH-40893: [Java][FlightRPC] Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver#40894

Merged
lidavidm merged 5 commits intoapache:mainfrom
pgwhalen:flightsql-support-month-day-nano-interval
Mar 31, 2024
Merged

GH-40893: [Java][FlightRPC] Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver#40894
lidavidm merged 5 commits intoapache:mainfrom
pgwhalen:flightsql-support-month-day-nano-interval

Conversation

@pgwhalen
Copy link
Copy Markdown
Contributor

@pgwhalen pgwhalen commented Mar 29, 2024

Rationale for this change

Fixes #40893.

What changes are included in this PR?

  • Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver
  • Return PeriodDuration as JDBC Object type, because there is no good java.time type for this interval
  • Return an ISO-8601 interval as the stringified version of PeriodDuration
  • Make PeriodDuration implement TemporalAccessor for standardization

Are these changes tested?

Unit tests have been added that match those for other interval types. I'm unaware of any other types of tests worth adding to, but I'd be happy to if pointed there.

Are there any user-facing changes?

The only change users should noticed is that the FlightSQL JDBC Driver can now handle more query responses.

 - Return PeriodDuration as JDBC Object type, because there is no good
 java.time type for this interval
 - Use the toString() of PeriodDuration as the StringGetter, because it isn't
 clear what a better stringification of the type would be.  IntervalStringUtils
 converts other intervals to Oracle's string representation of interval types,
 but there doesn't appear to be an analog for this type
@pgwhalen pgwhalen requested a review from lidavidm as a code owner March 29, 2024 16:19
@github-actions
Copy link
Copy Markdown

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@pgwhalen pgwhalen changed the title GH- 40893: [Java][FlightRPC] Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver GH-40893: [Java][FlightRPC] Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver Mar 29, 2024
@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #40893 has been automatically assigned in GitHub to PR creator.

@pgwhalen
Copy link
Copy Markdown
Contributor Author

If there is an idea for a better "stringification" of this type, I'm happy to implement it. The postgres interval type comes to mind, though that would be inconsistent with the existing Oracle type stringifications.

@laurentgo
Copy link
Copy Markdown
Contributor

laurentgo commented Mar 29, 2024

If there is an idea for a better "stringification" of this type, I'm happy to implement it. The postgres interval type comes to mind, though that would be inconsistent with the existing Oracle type stringifications.

Isn't ISO-8601/RFC3339 a good and relatively well-known format?

@pgwhalen
Copy link
Copy Markdown
Contributor Author

Isn't ISO-8601/RFC3339 a good and relatively well-known format?

I think that should work. The toString() of Period and Duration produce ISO-8601 compatible strings already. I thought they had overlapping units but upon second glance it doesn't appear they do, so I can more or less concatenate them and get a string.

@lidavidm
Copy link
Copy Markdown
Member

I was going to ask what other drivers do, but it appears Postgres also has a custom type (PGInterval). So this is probably reasonable, except I wonder, since we shade the JAR for distribution, this will force people to refer to a shaded class?

@lidavidm
Copy link
Copy Markdown
Member

I suppose Joda Time's period was not fully adopted into Java, as its Period does have time units... https://www.joda.org/joda-time/apidocs/org/joda/time/Period.html

@laurentgo
Copy link
Copy Markdown
Contributor

I suppose Joda Time's period was not fully adopted into Java, as its Period does have time units... https://www.joda.org/joda-time/apidocs/org/joda/time/Period.html

I guess the driver could implement java.time.TemporalAmount which would be as standard as possible?

@pgwhalen
Copy link
Copy Markdown
Contributor Author

So what I'm going for:

  • PeriodDuration is still the object returned by getObject() to JDBC clients (matching the return type on the underlying vector)
  • PeriodDuration should implement TemporalAmount
  • The string returned by its StringGetter should be an ISO-8601 interval

Does that sound right @laurentgo @lidavidm ?

It's perhaps worth pointing out that I'll basically be re-implementing ThreeTen's PeriodDuration class. I assume it's not desirable to replace Arrow's PeriodDuration with that class entirely.

@lidavidm
Copy link
Copy Markdown
Member

That sounds good to me, thank you.

We probably don't want to take a dependency on that, no. I suppose we have to just live with the shaded name, too (probably for the best if there's no stdlib type that fits).

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Mar 31, 2024
@lidavidm
Copy link
Copy Markdown
Member

lidavidm commented Mar 31, 2024

Hmm, it appears ISO8601 durations as proposed for ECMAScript do not allow negative components. But Java does this (as we see here), other systems like Pandas do as well, and at some point the ISO spec was revised to allow it (even though this doesn't seem to have been made easily accessible...): https://www.postgresql.org/message-id/9q0ftb37dv7.fsf%40gmx.us

so I'm ok with it.

@conbench-apache-arrow
Copy link
Copy Markdown

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 17a5368.

There was 1 benchmark result indicating a performance regression:

The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.

vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…in FlightSQL JDBC Driver (apache#40894)

### Rationale for this change

Fixes apache#40893.

### What changes are included in this PR?

 - Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver
 - Return PeriodDuration as JDBC Object type, because there is no good java.time type for this interval
 - Return an ISO-8601 interval as the stringified version of PeriodDuration
 - Make PeriodDuration implement TemporalAccessor for standardization

### Are these changes tested?

Unit tests have been added that match those for other interval types.  I'm unaware of any other types of tests worth adding to, but I'd be happy to if pointed there.

### Are there any user-facing changes?

The only change users should noticed is that the FlightSQL JDBC Driver can now handle more query responses.
* GitHub Issue: apache#40893

Authored-by: paul <pgwhalen@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…in FlightSQL JDBC Driver (apache#40894)

### Rationale for this change

Fixes apache#40893.

### What changes are included in this PR?

 - Support IntervalMonthDayNanoVector in FlightSQL JDBC Driver
 - Return PeriodDuration as JDBC Object type, because there is no good java.time type for this interval
 - Return an ISO-8601 interval as the stringified version of PeriodDuration
 - Make PeriodDuration implement TemporalAccessor for standardization

### Are these changes tested?

Unit tests have been added that match those for other interval types.  I'm unaware of any other types of tests worth adding to, but I'd be happy to if pointed there.

### Are there any user-facing changes?

The only change users should noticed is that the FlightSQL JDBC Driver can now handle more query responses.
* GitHub Issue: apache#40893

Authored-by: paul <pgwhalen@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Java][FlightRPC] IntervalMonthDayNanoVector should be supported in FlightSQL JDBC Driver

3 participants