
Use custom codecs in TPCH #26650

Draft
tdcmeehan wants to merge 3 commits into prestodb:master from tdcmeehan:dyncon3


@tdcmeehan

Contributor

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Nov 18, 2025

@sourcery-ai sourcery-ai Bot left a comment

Contributor


Sorry @tdcmeehan, your pull request is larger than the review limit of 150000 diff characters

Add support for binary deserialization of connector handles through the
ConnectorProtocol interface. This enables connectors to provide custom
binary deserialization alongside the existing JSON support.

Add binary deserialization for TPCH connector handles in C++ to match
the Java serialization format. Includes TpchPrestoToVeloxConnector
for converting protocol objects to Velox representations.
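
The idea of a protocol interface that offers a binary path next to the JSON path could be sketched roughly as follows. This is only an illustration: `ConnectorProtocolSketch`, `ExampleConnectorProtocol`, `ExampleTableHandle`, and the method signatures are hypothetical names invented here, not the actual declarations from this PR, and the "payload is the table name" format is an assumption made to keep the example small.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a generated protocol handle type.
struct ConnectorTableHandle {
  virtual ~ConnectorTableHandle() = default;
};

// Hypothetical connector-specific handle.
struct ExampleTableHandle : ConnectorTableHandle {
  std::string tableName;
};

// Sketch of a per-connector protocol with a JSON path and a binary path.
// Both default to "not implemented" so a connector can opt in to either.
class ConnectorProtocolSketch {
 public:
  virtual ~ConnectorProtocolSketch() = default;

  // JSON path, simplified here to take the raw JSON text.
  virtual std::shared_ptr<ConnectorTableHandle> fromJson(
      const std::string& /*json*/) const {
    throw std::runtime_error("fromJson not implemented");
  }

  // Binary path: receives the raw bytes produced by the coordinator-side codec.
  virtual std::shared_ptr<ConnectorTableHandle> deserialize(
      const std::string& /*bytes*/) const {
    throw std::runtime_error("deserialize not implemented");
  }
};

// A connector that only supports the binary path.
class ExampleConnectorProtocol : public ConnectorProtocolSketch {
 public:
  std::shared_ptr<ConnectorTableHandle> deserialize(
      const std::string& bytes) const override {
    auto handle = std::make_shared<ExampleTableHandle>();
    // Assumed toy format: the whole payload is the table name.
    handle->tableName = bytes;
    return handle;
  }
};
```

With this shape, a connector that overrides only `deserialize()` still reports a clear error if a JSON-encoded handle reaches it, rather than silently misparsing it.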
20001020ycx added a commit to y-scope/presto that referenced this pull request Apr 22, 2026
…ization

Port binary deserialization routing from Tim's PR prestodb#26650 (commit 0ee0f05)
and extend it to cover all 5 handle types in both code paths:

- .cpp.inc files (Thrift deserialization path) - from PR prestodb#26650
- presto_protocol_core.cpp (JSON from_json path) - new addition

The JSON from_json path was missing: when Java sends handles with
customSerializedValue (binary serialization via Base64), the C++ side
fell through to getConnectorProtocol(type).from_json() which is NYI for
custom connectors like CLP.

Also includes ConnectorProtocol.h/.h changes adding serialize/deserialize
virtual methods and ColumnHandle serialize/deserialize stubs.
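
The routing described above (prefer the Base64-encoded `customSerializedValue` when present, otherwise fall through to the JSON path) might look something like the sketch below. `HandleEnvelope`, `DecodedHandle`, `routeHandle`, and `decodeBase64` are hypothetical helpers written for this illustration; the real code routes into the connector's protocol object rather than returning a tagged payload.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Decode standard Base64 (RFC 4648 alphabet); stops at the first '=' padding.
std::string decodeBase64(const std::string& input) {
  auto valueOf = [](char c) -> int {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
  };
  std::string out;
  uint32_t buffer = 0;
  int bits = 0;
  for (char c : input) {
    if (c == '=') {
      break;
    }
    const int v = valueOf(c);
    if (v < 0) {
      throw std::invalid_argument("invalid Base64 character");
    }
    buffer = (buffer << 6) | static_cast<uint32_t>(v);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
    }
  }
  return out;
}

// Hypothetical, simplified view of a serialized handle as it arrives.
struct HandleEnvelope {
  std::string connectorType;                         // e.g. "tpch"
  std::optional<std::string> customSerializedValue;  // Base64 text, if set
  std::string json;                                  // JSON body otherwise
};

// Which path was taken, plus the payload handed to that path.
struct DecodedHandle {
  std::string path;  // "binary" or "json"
  std::string payload;
};

// Prefer the binary path when customSerializedValue is present.
DecodedHandle routeHandle(const HandleEnvelope& envelope) {
  if (envelope.customSerializedValue.has_value()) {
    return {"binary", decodeBase64(*envelope.customSerializedValue)};
  }
  return {"json", envelope.json};
}
```

Checking for the binary field first means a connector whose JSON path is not implemented is never asked to parse JSON for a handle that was sent in binary form.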
20001020ycx added a commit to y-scope/presto that referenced this pull request Apr 22, 2026
…e types

Add customSerializedValue routing in from_json() for the 5 remaining
connector handle types from Tim's PR prestodb#26650:
- ConnectorDeleteTableHandle
- ConnectorIndexHandle
- ConnectorInsertTableHandle
- ConnectorOutputTableHandle
- ConnectorPartitioningHandle
20001020ycx added a commit to y-scope/presto that referenced this pull request Apr 22, 2026
Ports customSerializedValue routing from prestodb#26650 (0ee0f05)
for: ConnectorDeleteTableHandle, ConnectorIndexHandle,
ConnectorInsertTableHandle, ConnectorOutputTableHandle,
ConnectorPartitioningHandle.
20001020ycx added a commit to 20001020ycx/presto that referenced this pull request Apr 27, 2026
…ch style

Rename ClpCodecProvider → ClpConnectorCodecProvider to follow Tim's
TpchConnectorCodecProvider naming convention. Consolidate all 5 codec
classes into inner static classes within a single file. Adopt Tim's
variable naming style (cast to type-specific variable) and error
handling pattern (UncheckedIOException with plain try blocks).

Follows the approach proposed in prestodb#26650.
20001020ycx added a commit to 20001020ycx/presto that referenced this pull request Apr 27, 2026
Move each codec class out of ClpConnectorCodecProvider into its own
file. ClpConnectorCodecProvider now only provides the connector-level
factory methods. Each codec still follows Tim's style from prestodb#26650
(cast to type-specific variable, UncheckedIOException, plain try).
20001020ycx added a commit to y-scope/presto that referenced this pull request Apr 29, 2026
Implement binary deserialization for all 5 CLP connector handle types
(ClpColumnHandle, ClpSplit, ClpTableHandle, ClpTableLayoutHandle,
ClpTransactionHandle) by inheriting ConnectorProtocol and overriding
deserialize().

Reads binary bytes (Java DataOutputStream big-endian wire format from
#158) directly into existing protocol structs using std::istringstream
and folly::Endian::big.

Follows Tim's TpchConnectorProtocol pattern from prestodb#26650.
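
Reading Java `DataOutputStream` output on the C++ side comes down to decoding big-endian integers and length-prefixed strings from a byte stream. The sketch below uses `std::istringstream` as the commit describes, but assembles bytes with manual shifts instead of `folly::Endian::big` so that it stays self-contained. `BigEndianReader`, `ExampleSplit`, and the field layout are hypothetical and are not the CLP wire format; `readUTF` assumes ASCII content, where Java's modified UTF-8 and plain UTF-8 coincide.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Reads values written by java.io.DataOutputStream (big-endian byte order).
class BigEndianReader {
 public:
  explicit BigEndianReader(const std::string& bytes)
      : stream_(bytes, std::ios::binary) {}

  // Counterpart of writeInt: 4 bytes, most significant first.
  int32_t readInt() {
    return static_cast<int32_t>(static_cast<uint32_t>(readUnsigned(4)));
  }

  // Counterpart of writeLong: 8 bytes, most significant first.
  int64_t readLong() {
    return static_cast<int64_t>(readUnsigned(8));
  }

  // Counterpart of writeUTF: 2-byte unsigned length, then that many bytes.
  std::string readUTF() {
    const auto length = static_cast<std::size_t>(readUnsigned(2));
    std::string value(length, '\0');
    stream_.read(value.data(), static_cast<std::streamsize>(length));
    if (static_cast<std::size_t>(stream_.gcount()) != length) {
      throw std::runtime_error("unexpected end of input");
    }
    return value;
  }

 private:
  // Accumulate `width` bytes into an unsigned value, high byte first.
  uint64_t readUnsigned(int width) {
    uint64_t result = 0;
    for (int i = 0; i < width; ++i) {
      const int byte = stream_.get();
      if (byte == std::char_traits<char>::eof()) {
        throw std::runtime_error("unexpected end of input");
      }
      result = (result << 8) | static_cast<uint64_t>(byte & 0xFF);
    }
    return result;
  }

  std::istringstream stream_;
};

// Hypothetical handle with an assumed layout: UTF path, int part, long offset.
struct ExampleSplit {
  std::string path;
  int32_t part = 0;
  int64_t offset = 0;
};

// Fields must be read in the same order the Java side wrote them.
ExampleSplit deserializeExampleSplit(const std::string& bytes) {
  BigEndianReader reader(bytes);
  ExampleSplit split;
  split.path = reader.readUTF();
  split.part = reader.readInt();
  split.offset = reader.readLong();
  return split;
}
```

Because the format carries no field names or tags, the read order here has to mirror the write order on the Java side exactly; any mismatch shows up as garbage values or an end-of-input error rather than a descriptive parse failure.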