feat(native): Add an Arrow federation connector to run federated queries #26404
pdabre12 wants to merge 1 commit
Sorry @pdabre12, your pull request is larger than the review limit of 150000 diff characters
@sourcery-ai review
Is the proposed design to run the Flight server as a plugin in the single-node Java coordinator itself? Does all the data fetching then happen through the Java coordinator?
@elbinpallimalilibm The proposed design is to have a separate Flight server shim, a lightweight Java process, running independently. Data fetching occurs through this process. The shim uses connector plugin loading logic similar to that of a Java coordinator.
When would a Presto C++ worker + Arrow Flight connector + Java Flight server shim be more advantageous than a Presto Java worker with a JDBC connector?
Arrow Flight server that can load Presto Java connectors, and provide a record batch stream from the given connector split. The FlightShim server is designed to work with a native arrow connector, that will use a Flight client to forward the connector split to the server and process the record batch stream. See related PR at prestodb#26404 RFC: prestodb/rfcs#46 Co-authored-by: Pratik Joseph Dabre <pdabre12@gmail.com>
…workers (#26369)

## Description
This adds a FlightShim module for connector federation. This includes an Apache Arrow Flight server that can load Presto Java connectors, and provide a record batch stream from the given connector split. The FlightShim server is designed to work with a native arrow connector, that will use a Flight client to forward the connector split to the server and process the record batch stream.

See related PR at #26404
RFC: prestodb/rfcs#46

## Motivation and Context
This provides a path to connector federation with native workers

## Impact
No API changes required.

## Test Plan
Included unit tests for the FlightShim server functionality. The followup native connector #26404 includes e2e testing between the coordinator - native FlightShim connector - FlightShim server - multiple data sources. We have also tested internally in a full platform environment.

## Contributor checklist
- [x] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [x] PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
- [x] Documented new properties (with its default value), SQL syntax, functions, or other functionality.
- [x] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [x] Adequate tests were added if applicable.
- [x] CI passed.
- [x] If adding new dependencies, verified they have an [OpenSSF Scorecard](https://securityscorecards.dev/#the-checks) score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

## Release Notes
Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Adding presto-flight-shim server module for connector federation.
```

Co-authored-by: Pratik Joseph Dabre <pdabre12@gmail.com>
@SourceryAI review
Description
Enable C++ workers to run queries through existing single-node Java connector implementations by introducing a dedicated Arrow Federation connector. This connector forwards requests to an Arrow Flight server shim. That gives connectors that have not yet been ported to C++ a practical migration path and lets users run federated queries in a native cluster.
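
To make the request path concrete, here is a minimal, library-free sketch of the flow described above: the worker-side connector serializes a split, hands it to the shim, and drains the resulting stream of record batches. It is an illustration only, not the PR's implementation. Every name in it (`Split`, `ShimServer`, `toy_reader`, `worker_scan`) is hypothetical, the JSON ticket encoding is an assumption, and plain Python lists stand in for Arrow record batches. The real connector is C++ and the real shim is a Java Arrow Flight server.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, Iterator, List

# A "record batch" is modelled as a plain list of rows (assumption: the real
# system exchanges Arrow record batches over Arrow Flight).
Batch = List[dict]


@dataclass(frozen=True)
class Split:
    """Hypothetical connector split: which connector to use and what to read."""
    connector_id: str
    table: str
    start: int
    end: int


def encode_ticket(split: Split) -> bytes:
    """Serialize the split into opaque bytes (JSON is an assumed encoding)."""
    return json.dumps(asdict(split)).encode("utf-8")


def decode_ticket(ticket: bytes) -> Split:
    """Inverse of encode_ticket, run on the shim side."""
    return Split(**json.loads(ticket.decode("utf-8")))


class ShimServer:
    """Stands in for the Flight server shim, which owns the loaded connectors."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Callable[[Split], Iterator[Batch]]] = {}

    def register(self, connector_id: str,
                 reader: Callable[[Split], Iterator[Batch]]) -> None:
        # Loosely mirrors loading a connector plugin under a catalog name.
        self._connectors[connector_id] = reader

    def get_stream(self, ticket: bytes) -> Iterator[Batch]:
        # Decode the forwarded split and let the matching connector read it.
        split = decode_ticket(ticket)
        return self._connectors[split.connector_id](split)


def toy_reader(split: Split, batch_size: int = 2) -> Iterator[Batch]:
    """Toy connector that yields the rows of the split in small batches."""
    rows = [{"table": split.table, "id": i}
            for i in range(split.start, split.end)]
    for offset in range(0, len(rows), batch_size):
        yield rows[offset:offset + batch_size]


def worker_scan(server: ShimServer, split: Split) -> List[dict]:
    """Stands in for the native connector: forward the split, drain the stream."""
    rows: List[dict] = []
    for batch in server.get_stream(encode_ticket(split)):
        rows.extend(batch)
    return rows


server = ShimServer()
server.register("toy", toy_reader)
rows = worker_scan(server, Split("toy", "orders", 0, 5))
```

In this sketch the worker never touches the data source directly. It only needs to know how to serialize a split and consume batches, which is why an unported connector can stay on the Java side.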
Motivation and Context
RFC: prestodb/rfcs#46
Impact
No impact
Test Plan
Includes end-to-end testing across the full path: the coordinator, the native FlightShim connector, the FlightShim server, and multiple data sources. We have also tested internally in a full platform environment.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.