feat: Prism <> Iceberg [prism] Support SQL-standard time travel syntax (FOR TIMESTAMP AS OF) (#27421)
Reviewer's Guide
Extends Prism’s table version/time-travel support by wiring the SQL-standard
Sequence diagram for FOR TIMESTAMP AS OF resolution in Prism connector
sequenceDiagram
actor User
participant PrestoParser
participant StatementAnalyzer
participant PrismMetadata
participant PrismMetastoreClient as Metastore
User->>PrestoParser: SQL with FOR TIMESTAMP AS OF
PrestoParser-->>StatementAnalyzer: Parsed query AST with TableVersion
StatementAnalyzer->>StatementAnalyzer: evaluateConstantExpression(stateExpression, stateExpressionType, metadata, session, parameters)
StatementAnalyzer->>StatementAnalyzer: validate type is TimestampType
StatementAnalyzer->>StatementAnalyzer: or TimestampWithTimeZoneType
StatementAnalyzer->>StatementAnalyzer: or BigintType or VarcharType
StatementAnalyzer-->>PrismMetadata: getTableHandle(session, schemaTableName, tableVersion)
PrismMetadata->>PrismMetadata: extractTimestampMillis(tableVersion)
PrismMetadata->>PrismMetadata: encodeTableNameWithVersion(schemaTableName, timestampMillis)
PrismMetadata->>Metastore: getTable(encodedSchemaTableName)
Metastore-->>PrismMetadata: PrismTable
PrismMetadata-->>StatementAnalyzer: TableHandle for snapshot
StatementAnalyzer-->>User: Query analyzed with time-travel table handle
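The `extractTimestampMillis` step in the diagram can be pictured with a minimal sketch like the one below. It is not the connector's code: the `Kind` enum stands in for the engine's type objects, and the Java representation of each evaluated constant (a `Long` for TIMESTAMP and BIGINT, a `ZonedDateTime` for TIMESTAMP WITH TIME ZONE, a `String` for VARCHAR) is an assumption made for illustration.

```java
import java.time.ZonedDateTime;

final class ExtractTimestampSketch {
    // Hypothetical stand-in for the engine's type objects; not the real SPI classes.
    enum Kind { TIMESTAMP, TIMESTAMP_WITH_TIME_ZONE, BIGINT, VARCHAR }

    // Reduces every accepted input to epoch milliseconds.
    static long extractTimestampMillis(Kind kind, Object value) {
        switch (kind) {
            case TIMESTAMP:
            case BIGINT:
                // Assumed to already carry epoch milliseconds.
                return ((Number) value).longValue();
            case TIMESTAMP_WITH_TIME_ZONE:
                // Assumed to arrive as a ZonedDateTime; the zone does not change the instant.
                return ((ZonedDateTime) value).toInstant().toEpochMilli();
            case VARCHAR:
                try {
                    return Long.parseLong(((String) value).trim());
                }
                catch (NumberFormatException e) {
                    throw new IllegalArgumentException("Expected an epoch millis string, got: " + value, e);
                }
            default:
                throw new IllegalArgumentException("Unsupported type: " + kind);
        }
    }
}
```

Under these assumptions, `BIGINT '1705312800000'`, `'1705312800000'`, and a timestamp with time zone for 2024-01-15 10:00:00 UTC all resolve to the same millisecond value.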
Class diagram for updated PrismMetadata time travel support
classDiagram
class ConnectorMetadata {
<<interface>>
+getTableHandle(session, schemaTableName) TableHandle
+getTableHandle(session, schemaTableName, tableVersion) TableHandle
}
class ConnectorTableVersion {
+versionType
+expression
}
class PrismMetadata {
+getTableHandle(session, schemaTableName) TableHandle
+getTableHandle(session, schemaTableName, tableVersion) TableHandle
-encodeTableNameWithVersion(schemaTableName, timestampMillis) SchemaTableName
-extractTimestampMillis(tableVersion) long
}
class PrismMetastoreClient {
+getTable(schemaTableName) PrismTable
}
class SchemaTableName {
+schemaName
+tableName
}
class TableHandle
class PrismTable {
+schemaTableName
+snapshotTimestampMillis
}
ConnectorMetadata <|.. PrismMetadata
PrismMetadata --> ConnectorTableVersion : uses
PrismMetadata --> PrismMetastoreClient : uses
PrismMetadata --> SchemaTableName : encodes
PrismMetastoreClient --> PrismTable : returns
PrismMetadata --> TableHandle : creates
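To make the `encodes` and `uses` relationships concrete, here is a rough sketch of how a timestamp might be folded into a table name and then looked up. The `@timestamp=` marker is purely hypothetical (the connector's actual `timestamp` encoding is not shown in this PR), and the `SchemaTableName` class and map-backed `getTable` are simplified stand-ins rather than the real SPI or metastore client.

```java
import java.util.Map;
import java.util.Optional;

final class EncodeTableNameSketch {
    // Hypothetical marker; the real encoding used by the connector is not shown here.
    static final String HYPOTHETICAL_MARKER = "@timestamp=";

    // Minimal stand-in for the SPI's SchemaTableName.
    static final class SchemaTableName {
        final String schemaName;
        final String tableName;

        SchemaTableName(String schemaName, String tableName) {
            this.schemaName = schemaName;
            this.tableName = tableName;
        }

        @Override
        public String toString() {
            return schemaName + "." + tableName;
        }
    }

    // Appends the snapshot timestamp to the table name and leaves the schema untouched.
    static SchemaTableName encodeTableNameWithVersion(SchemaTableName name, long timestampMillis) {
        return new SchemaTableName(name.schemaName, name.tableName + HYPOTHETICAL_MARKER + timestampMillis);
    }

    // In-memory stand-in for the metastore client's getTable call.
    static Optional<String> getTable(Map<String, String> metastore, SchemaTableName name) {
        return Optional.ofNullable(metastore.get(name.toString()));
    }
}
```

The point of the design is that the metastore lookup itself is unchanged: only the name passed to it differs, which is why the existing snapshot support can be reused.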
Class diagram for StatementAnalyzer table version validation
classDiagram
class StatementAnalyzer {
-analysis
-metadata
-session
-evaluateConstantExpression(expression, type, metadata, session, parameters) Object
-processTableVersion(table, qualifiedObjectName, tableVersionType, stateExpression, stateExpressionType) Optional~TableHandle~
}
class Table
class QualifiedObjectName
class TableHandle
class TableVersionType {
<<enum>>
TIMESTAMP
SYSTEM_TIME
VERSION
}
class Type
class TimestampType
class TimestampWithTimeZoneType
class BigintType
class VarcharType
StatementAnalyzer --> Table : processes
StatementAnalyzer --> QualifiedObjectName : resolves
StatementAnalyzer --> TableHandle : returns
StatementAnalyzer --> TableVersionType : checks
StatementAnalyzer --> Type : uses
Type <|-- TimestampType
Type <|-- TimestampWithTimeZoneType
Type <|-- BigintType
Type <|-- VarcharType
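The widened type check can be approximated as follows. This is only a sketch: it compares type names as strings instead of using the analyzer's `Type` objects, and both the helper names and the error wording are assumptions, not the text used in `StatementAnalyzer`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class TableVersionTypeCheckSketch {
    // The four expression types named in the diagram above.
    static final List<String> SUPPORTED =
            Arrays.asList("timestamp", "timestamp with time zone", "bigint", "varchar");

    // Drops any length parameter, for example "varchar(13)" becomes "varchar".
    static String baseTypeName(String typeSignature) {
        String lower = typeSignature.trim().toLowerCase(Locale.ROOT);
        int paren = lower.indexOf('(');
        return paren < 0 ? lower : lower.substring(0, paren).trim();
    }

    // Throws when the expression type is not one of the supported types.
    static void validateTimestampExpressionType(String typeSignature) {
        if (!SUPPORTED.contains(baseTypeName(typeSignature))) {
            throw new IllegalArgumentException(String.format(
                    "Type %s is not supported in FOR TIMESTAMP AS OF; supported types: %s",
                    typeSignature,
                    SUPPORTED));
        }
    }
}
```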
Hey - I've left some high level feedback:
- Now that BIGINT and VARCHAR are accepted for TIMESTAMP-based AS OF/BEFORE clauses, consider normalizing those constant values to a single internal representation (e.g., Timestamp/TimestampWithTimeZone) immediately after `evaluateConstantExpression` so downstream code does not have to handle additional Java types.
- If VARCHAR is meant to support only specific timestamp formats or time zones, it may be safer to enforce that format here and fail fast with a targeted error message rather than relying on downstream parsing behavior.
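One way to read both suggestions together is sketched below: every accepted constant is converted to a single representation right away, and a VARCHAR that is not a plain epoch-millis string is rejected with a specific message. The class name, the choice of `java.time.Instant` as the internal representation, and the 18-digit limit are all assumptions for illustration, not what the PR implements.

```java
import java.time.Instant;
import java.util.regex.Pattern;

final class NormalizeVersionValueSketch {
    // Optional sign plus at most 18 digits, so Long.parseLong cannot overflow.
    private static final Pattern EPOCH_MILLIS = Pattern.compile("-?\\d{1,18}");

    // Converts a BIGINT (Long) or VARCHAR (String) constant to an Instant, failing fast otherwise.
    static Instant normalize(Object value) {
        if (value instanceof Long) {
            return Instant.ofEpochMilli((Long) value);
        }
        if (value instanceof String) {
            String text = (String) value;
            if (!EPOCH_MILLIS.matcher(text).matches()) {
                throw new IllegalArgumentException(
                        "VARCHAR table version must be an epoch millis string such as '1705312800000', got: '" + text + "'");
            }
            return Instant.ofEpochMilli(Long.parseLong(text));
        }
        throw new IllegalArgumentException("Unsupported table version value: " + String.valueOf(value));
    }
}
```

With this shape, downstream code only ever sees an `Instant`, and a value such as `'2024-01-15'` is rejected at the point of normalization rather than deeper in the connector.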
… in Prism connector (prestodb#27421)

Summary:
The Presto parser and analyzer already support the SQL-2011 temporal query syntax (`FOR TIMESTAMP AS OF`, `FOR SYSTEM_TIME AS OF`) via the `ConnectorTableVersion` SPI. The Iceberg connector implements it. The Prism connector does not, despite Metastore already supporting snapshot versioning through the `timestamp` table name encoding.

This change:
1. Adds the 3-argument `getTableHandle(session, tableName, tableVersion)` override to `PrismMetadata` that converts the SQL time travel clause into the existing `timestamp` encoding
2. Extends `StatementAnalyzer` to accept `BIGINT` and `VARCHAR` expression types in `FOR TIMESTAMP AS OF` clauses (previously only `TimestampType` and `TimestampWithTimeZoneType` were allowed)

This enables users to write clean time travel queries with multiple syntax options:

```sql
-- Timestamp syntax (existing)
SELECT * FROM dim_users FOR TIMESTAMP AS OF TIMESTAMP '2024-01-15 10:00:00'

-- BIGINT epoch millis (new)
SELECT * FROM dim_users FOR TIMESTAMP AS OF BIGINT '1705312800000'

-- VARCHAR epoch millis string (new)
SELECT * FROM dim_users FOR TIMESTAMP AS OF '1705312800000'
```

instead of the current session-property approach:

```sql
SET SESSION prism.source_table_snapshots_enabled = true;
SET SESSION prism.source_table_snapshots_timestamp_ms = 1705312800000;
SELECT * FROM dim_users;
```

## Benefits
- **Per-table granularity**: Different tables in the same query can time-travel to different timestamps (impossible with session properties)
- **Self-documenting**: Timestamp is in the SQL, not hidden in session state
- **SQL-standard**: Compatible with Athena, Spark 3.3+, SQL-2011
- **Multiple input formats**: Accepts TIMESTAMP, TIMESTAMP WITH TIME ZONE, BIGINT epoch millis, or VARCHAR epoch millis strings
- **No session state management**: Just write the query
- **Backward compatible**: Existing session properties continue to work unchanged

## Supported SQL syntax
- `FOR TIMESTAMP AS OF TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles'` (TimestampWithTimeZoneType)
- `FOR TIMESTAMP AS OF TIMESTAMP '2023-08-17 13:29:46.822'` (TimestampType)
- `FOR TIMESTAMP AS OF CURRENT_TIMESTAMP` (CurrentTime)
- `FOR TIMESTAMP AS OF BIGINT '1624893589000'` (BIGINT epoch millis, NEW)
- `FOR TIMESTAMP AS OF '1624893589000'` (VARCHAR epoch millis string, NEW)

## Implementation
### StatementAnalyzer (presto-trunk)
- Modified type check in `FOR TIMESTAMP AS OF` validation to also accept `BigintType` and `VarcharType` (in addition to `TimestampType` and `TimestampWithTimeZoneType`)
- Updated error message to list all supported types

### PrismMetadata (presto-facebook-trunk)
- `PrismMetadata.getTableHandle(session, tableName, Optional<ConnectorTableVersion>)`: new 3-arg override
- `PrismMetadata.encodeTableNameWithVersion(tableName, timestampMs)`: reuses existing `timestamp` encoding
- `PrismMetadata.extractTimestampMillis(version)`: handles TimestampType, TimestampWithTimeZoneType, BigintType, and VarcharType
- VERSION-based time travel and BEFORE operator are rejected with clear error messages

Reference implementation: `IcebergAbstractMetadata.getSnapshotIdForTableVersion()`

Differential Revision: D97600298
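The rejection of VERSION-based time travel and the BEFORE operator might look roughly like the guard below. The two enums are simplified stand-ins for the version type and operator carried by `ConnectorTableVersion` (their exact names are an assumption here), and the messages are illustrative wording only.

```java
final class VersionSupportCheckSketch {
    // Simplified stand-ins for the values carried by ConnectorTableVersion.
    enum VersionType { TIMESTAMP, VERSION }

    // AS OF is modeled as EQUAL and BEFORE as LESS_THAN in this sketch.
    enum VersionOperator { EQUAL, LESS_THAN }

    // Accepts only FOR TIMESTAMP AS OF; everything else fails with a specific message.
    static void checkSupported(VersionType type, VersionOperator operator) {
        if (type != VersionType.TIMESTAMP) {
            throw new UnsupportedOperationException(
                    "VERSION-based time travel is not supported; use FOR TIMESTAMP AS OF");
        }
        if (operator != VersionOperator.EQUAL) {
            throw new UnsupportedOperationException(
                    "The BEFORE operator is not supported; use FOR TIMESTAMP AS OF");
        }
    }
}
```

Running the guard before any timestamp extraction keeps unsupported clauses from reaching the metastore lookup at all.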
d43411b to
1f140a2
1f140a2 to
5f4d1b7
5f4d1b7 to
2deb332
2deb332 to
117d806
117d806 to
b30c02c
b30c02c to
5cd4202
5cd4202 to
e3303c9
98a56db to
a2841ef
a2841ef to
677dcab
Compare
…x (FOR TIMESTAMP AS OF) (prestodb#27421)

Summary:
X-link: https://github.com/facebookexternal/presto-facebook/pull/3608

The Presto parser and analyzer already support the SQL-2011 temporal query syntax (`FOR TIMESTAMP AS OF`, `FOR SYSTEM_TIME AS OF`) via the `ConnectorTableVersion` SPI. The Iceberg connector implements it. The Prism connector does not, despite Metastore already supporting snapshot versioning through the `timestamp` table name encoding.

This change:
1. Adds the 3-argument `getTableHandle(session, tableName, tableVersion)` override to `PrismMetadata` that converts the SQL time travel clause into the existing `timestamp` encoding
2. Extends `StatementAnalyzer` to accept `BIGINT` and `VARCHAR` expression types in `FOR TIMESTAMP AS OF` clauses (previously only `TimestampType` and `TimestampWithTimeZoneType` were allowed)

This enables users to write clean time travel queries with multiple syntax options:

```sql
-- Timestamp syntax (existing)
SELECT * FROM dim_users FOR TIMESTAMP AS OF TIMESTAMP '2024-01-15 10:00:00'

-- BIGINT epoch millis (new)
SELECT * FROM dim_users FOR TIMESTAMP AS OF BIGINT '1705312800000'

-- VARCHAR epoch millis string (new)
SELECT * FROM dim_users FOR TIMESTAMP AS OF '1705312800000'
```

instead of the current session-property approach:

```sql
SET SESSION prism.source_table_snapshots_enabled = true;
SET SESSION prism.source_table_snapshots_timestamp_ms = 1705312800000;
SELECT * FROM dim_users;
```

## Benefits
- **Per-table granularity**: Different tables in the same query can time-travel to different timestamps (impossible with session properties)
- **Self-documenting**: Timestamp is in the SQL, not hidden in session state
- **SQL-standard**: Compatible with Athena, Spark 3.3+, SQL-2011
- **Multiple input formats**: Accepts TIMESTAMP, TIMESTAMP WITH TIME ZONE, BIGINT epoch millis, or VARCHAR epoch millis strings
- **No session state management**: Just write the query
- **Backward compatible**: Existing session properties continue to work unchanged

## Supported SQL syntax
- `FOR TIMESTAMP AS OF TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles'` (TimestampWithTimeZoneType)
- `FOR TIMESTAMP AS OF TIMESTAMP '2023-08-17 13:29:46.822'` (TimestampType)
- `FOR TIMESTAMP AS OF CURRENT_TIMESTAMP` (CurrentTime)
- `FOR TIMESTAMP AS OF BIGINT '1624893589000'` (BIGINT epoch millis, NEW)
- `FOR TIMESTAMP AS OF '1624893589000'` (VARCHAR epoch millis string, NEW)

## Implementation

### StatementAnalyzer (presto-trunk)
- Modified type check in `FOR TIMESTAMP AS OF` validation to also accept `BigintType` and `VarcharType` (in addition to `TimestampType` and `TimestampWithTimeZoneType`)
- Updated error message to list all supported types

### PrismMetadata (presto-facebook-trunk)
- `PrismMetadata.getTableHandle(session, tableName, Optional<ConnectorTableVersion>)`: new 3-arg override
- `PrismMetadata.encodeTableNameWithVersion(tableName, timestampMs)`: reuses existing `timestamp` encoding
- `PrismMetadata.extractTimestampMillis(version)`: handles TimestampType, TimestampWithTimeZoneType, BigintType, and VarcharType
- VERSION-based time travel and BEFORE operator are rejected with clear error messages

Reference implementation: `IcebergAbstractMetadata.getSnapshotIdForTableVersion()`

== NO RELEASE NOTE ==

Differential Revision: D97600298
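To make the `extractTimestampMillis` step more concrete, here is a minimal standalone sketch of how the four accepted input types could be reduced to epoch milliseconds. It is not the code from this PR: the class `TimeTravelSketch`, the `ExprType` enum, and the use of `java.time` values in place of Presto's internal type representations are all hypothetical stand-ins, and reading a zone-less timestamp as UTC is an assumption the description does not confirm.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

final class TimeTravelSketch {
    // Hypothetical stand-in for the four expression types listed in the description.
    enum ExprType { TIMESTAMP, TIMESTAMP_WITH_TIME_ZONE, BIGINT, VARCHAR }

    // Reduces a constant-folded FOR TIMESTAMP AS OF value to epoch milliseconds.
    static long extractTimestampMillis(ExprType type, Object value) {
        switch (type) {
            case TIMESTAMP:
                // Assumption: a timestamp without a zone is interpreted as UTC.
                return ((LocalDateTime) value).toInstant(ZoneOffset.UTC).toEpochMilli();
            case TIMESTAMP_WITH_TIME_ZONE:
                return ((ZonedDateTime) value).toInstant().toEpochMilli();
            case BIGINT:
                // BIGINT input is already epoch milliseconds.
                return (Long) value;
            case VARCHAR:
                try {
                    // VARCHAR input is expected to be an epoch-millis string.
                    return Long.parseLong(((String) value).trim());
                }
                catch (NumberFormatException e) {
                    throw new IllegalArgumentException("Expected an epoch millis string, got: " + value, e);
                }
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractTimestampMillis(ExprType.BIGINT, 1705312800000L));
        System.out.println(extractTimestampMillis(ExprType.VARCHAR, "1705312800000"));
        System.out.println(extractTimestampMillis(ExprType.TIMESTAMP, LocalDateTime.of(2024, 1, 15, 10, 0)));
    }
}
```

Under the UTC assumption, the `TIMESTAMP '2024-01-15 10:00:00'` example and the epoch value `1705312800000` used in the description refer to the same instant, so all three calls in `main` yield the same number.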
677dcab to
856cbb6
856cbb6 to
ecd4d50
ecd4d50 to
e15a27c
e15a27c to
ab6cc99
ab6cc99 to
82042b5
82042b5 to
63eaf87
@feilong-liu Can you merge this PR too? It has been merged internally at fb: https://github.com/facebookexternal/presto-facebook/commit/d4a19690896202720039a01cf8a88ab4f656f6c6

@apurva-meta Is there a reason we don't merge this PR?

Maybe you don't have the permissions?

Yes, I do not have the permissions to merge.
On Mon, Apr 6, 2026 at 9:51 AM Kevin Tang wrote:
Merged #27421 into master.
Reviewed By: ghelmling
Differential Revision: D97600298
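
The validation rules described in this change (timestamp-based `AS OF` only, with TIMESTAMP, TIMESTAMP WITH TIME ZONE, BIGINT, or VARCHAR expressions) can be illustrated with a small standalone sketch. The class and enum names below are hypothetical and do not reproduce the Presto SPI; in the actual change the expression-type check lives in `StatementAnalyzer`, while the rejection of VERSION-based time travel and BEFORE lives in `PrismMetadata`. They are combined here only for brevity.

```java
import java.util.EnumSet;
import java.util.Set;

final class TableVersionValidationSketch {
    // Hypothetical stand-ins for the version kind, operator, and expression type.
    enum VersionKind { TIMESTAMP, VERSION }
    enum Operator { AS_OF, BEFORE }
    enum ExprType { TIMESTAMP, TIMESTAMP_WITH_TIME_ZONE, BIGINT, VARCHAR, DOUBLE }

    // Expression types accepted in FOR TIMESTAMP AS OF after this change.
    private static final Set<ExprType> ALLOWED_FOR_TIMESTAMP = EnumSet.of(
            ExprType.TIMESTAMP,
            ExprType.TIMESTAMP_WITH_TIME_ZONE,
            ExprType.BIGINT,
            ExprType.VARCHAR);

    static void validate(VersionKind kind, Operator operator, ExprType type) {
        if (kind != VersionKind.TIMESTAMP) {
            throw new UnsupportedOperationException("VERSION-based time travel is not supported");
        }
        if (operator != Operator.AS_OF) {
            throw new UnsupportedOperationException("BEFORE is not supported; use AS OF");
        }
        if (!ALLOWED_FOR_TIMESTAMP.contains(type)) {
            // The message lists every supported type, as the description requires.
            throw new IllegalArgumentException(
                    "Type " + type + " is invalid. Supported types: " + ALLOWED_FOR_TIMESTAMP);
        }
    }

    public static void main(String[] args) {
        // A BIGINT epoch-millis expression with AS OF passes silently.
        validate(VersionKind.TIMESTAMP, Operator.AS_OF, ExprType.BIGINT);
        try {
            validate(VersionKind.VERSION, Operator.AS_OF, ExprType.BIGINT);
        }
        catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Using an `EnumSet` for the accepted types keeps the check and the error message in sync, so adding a type in one place updates both.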