GH-37553: [Java] Allow FlightInfo#Schema to be nullable for long-running queries#37528
GH-37553: [Java] Allow FlightInfo#Schema to be nullable for long-running queries#37528lidavidm merged 1 commit intoapache:mainfrom
Conversation
|
|
|
It seems that the current implementation already accepts the empty schema message. And an empty schema is used for the case. How about passing an empty instead of accepting |
|
I think there are two reasons:
Protobuf doesn't differentiate a null and an absent schema string, but an empty schema should serialize to a non-null string, so we can still tell the difference, too. |
|
My 0.02 cents: as a user of Arrow, it's not obvious that I should represent a missing schema as a new I tried to browse the codebase for conventions I could learn from, and saw |
|
Thanks. I understand. Could you create a new sub-issue of apache/arrow-java#167 (like #36620 and #36643) only for this (update the Java implementation)? |
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
Yes, the C++ implementation has the same (mistaken) assumption. |
I created #37554 for the docs update, and #37553 specifically for the Java update, and they are part of a task list in apache/arrow-java#167. I would like some help on the C++ issue. :) |
|
|
|
I can likely help for C++, will just take me a while to be able to get around to it |
lidavidm
left a comment
There was a problem hiding this comment.
FlightInfo#equals still needs updating
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightProducer.java
Outdated
Show resolved
Hide resolved
java/vector/src/main/java/org/apache/arrow/vector/ipc/message/MessageSerializer.java
Outdated
Show resolved
Hide resolved
2cfa4d2 to
567ee6c
Compare
lidavidm
left a comment
There was a problem hiding this comment.
Thanks! Just a couple final nits.
java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java
Outdated
Show resolved
Hide resolved
java/vector/src/main/java/org/apache/arrow/vector/ipc/message/MessageSerializer.java
Outdated
Show resolved
Hide resolved
…g-running queries With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time has elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. We can modify our client code to allow this field to be optional. This is already the case for the Go code. This changes the Java client code to allow the Schema to be null. `getSchema` methods now return `Optional<Schema>`, which is a backwards incompatible change.
|
(I tweaked the PR description to reflect the changes made.) |
|
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 48cc2ab. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them. |
|
No problem. Feel free to reach out with questions about Flight, since I see there's a discussion going on for Presto |
…g-running queries (apache#37528) With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. We can modify our client code to allow this field to be optional. This is already the case for the Go code. This changes the Java client code to allow the Schema to be null. A new `getSchemaOptional` method returns `Optional<Schema>`, which is a backwards compatible change. The existing method is deprecated, but will still return an empty schema if the schema is not present on wire (as it used to before). ### Rationale for this change With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. CC: `@ lidavidm` ### What changes are included in this PR? This changes the Java client code to allow the Schema to be null. `getSchema` is now deprecated and a new `getSchemaOptional` returns `Optional<Schema>`, which is a backwards compatible change. ### Are these changes tested? Existing tests ensure serialization and deserialization continue to work. ### Are there any user-facing changes? The `getSchema` methods are now deprecated in favor of `getSchemaOptional`. * Closes: apache#37553 Authored-by: Tim Meehan <tim@timdmeehan.com> Signed-off-by: David Li <li.davidm96@gmail.com>
…g-running queries (apache#37528) With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. We can modify our client code to allow this field to be optional. This is already the case for the Go code. This changes the Java client code to allow the Schema to be null. A new `getSchemaOptional` method returns `Optional<Schema>`, which is a backwards compatible change. The existing method is deprecated, but will still return an empty schema if the schema is not present on wire (as it used to before). ### Rationale for this change With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. CC: `@ lidavidm` ### What changes are included in this PR? This changes the Java client code to allow the Schema to be null. `getSchema` is now deprecated and a new `getSchemaOptional` returns `Optional<Schema>`, which is a backwards compatible change. ### Are these changes tested? Existing tests ensure serialization and deserialization continue to work. ### Are there any user-facing changes? The `getSchema` methods are now deprecated in favor of `getSchemaOptional`. * Closes: apache#37553 Authored-by: Tim Meehan <tim@timdmeehan.com> Signed-off-by: David Li <li.davidm96@gmail.com>
…g-running queries (apache#37528) With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. We can modify our client code to allow this field to be optional. This is already the case for the Go code. This changes the Java client code to allow the Schema to be null. A new `getSchemaOptional` method returns `Optional<Schema>`, which is a backwards compatible change. The existing method is deprecated, but will still return an empty schema if the schema is not present on wire (as it used to before). ### Rationale for this change With apache#36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning. In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request. CC: `@ lidavidm` ### What changes are included in this PR? This changes the Java client code to allow the Schema to be null. `getSchema` is now deprecated and a new `getSchemaOptional` returns `Optional<Schema>`, which is a backwards compatible change. ### Are these changes tested? Existing tests ensure serialization and deserialization continue to work. ### Are there any user-facing changes? The `getSchema` methods are now deprecated in favor of `getSchemaOptional`. * Closes: apache#37553 Authored-by: Tim Meehan <tim@timdmeehan.com> Signed-off-by: David Li <li.davidm96@gmail.com>
With #36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning.
In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request.
We can modify our client code to allow this field to be optional. This is already the case for the Go code.
This changes the Java client code to allow the Schema to be null. A new
getSchemaOptionalmethod returnsOptional<Schema>, which is a backwards compatible change. The existing method is deprecated, but will still return an empty schema if the schema is not present on wire (as it used to before).Rationale for this change
With #36155, implementations of Flight RPC may not return quickly via a newly added pollFlightInfo function. Sometimes, the system implementing this function may not know the output schema for some time--for example, after a lengthy queue time as elapsed, or after planning.
In proto3, fields may not be present, and it's a coding convention to require them 1. To support upcoming client integration work for pollFlightInfo, the schema field can be made optional so that it's not a requirement to populate the FlightInfo's schema on the first pollFlightInfo request.
CC:
@lidavidmWhat changes are included in this PR?
This changes the Java client code to allow the Schema to be null.
getSchemais now deprecated and a newgetSchemaOptionalreturnsOptional<Schema>, which is a backwards compatible change.Are these changes tested?
Existing tests ensure serialization and deserialization continue to work.
Are there any user-facing changes?
The
getSchemamethods are now deprecated in favor ofgetSchemaOptional.