## Bug Report

### Summary
When ingesting timezone-aware Arrow timestamps (e.g., timestamp[us, tz=UTC]) via the Snowflake ADBC driver, tables are automatically created with TIMESTAMP_LTZ columns instead of TIMESTAMP_TZ columns. This loses the explicit timezone information present in the Arrow data and causes temporal misalignment issues in multi-source pipelines.
### Related Issues
This is related to #2843, but specifically addresses timezone-aware timestamps where Arrow explicitly provides timezone metadata.
### What happened?
The Snowflake ADBC driver maps timezone-aware Arrow timestamps to TIMESTAMP_LTZ during table creation:
```
Arrow: timestamp[us, tz=UTC] → Snowflake: TIMESTAMP_LTZ
```
This is semantically incorrect because:
- `TIMESTAMP_LTZ` stores UTC values but interprets them relative to the session timezone
- `TIMESTAMP_TZ` stores UTC values with explicit timezone metadata preserved
When Arrow data explicitly specifies a timezone, that information should be preserved in Snowflake's TIMESTAMP_TZ type.
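For reference, the timezone is carried on the Arrow *type* itself, not per value; a minimal pyarrow check (standalone, not driver code):

```python
import pyarrow as pa

# The timezone is part of the Arrow type metadata.
aware = pa.timestamp("us", tz="UTC")
print(aware)     # timestamp[us, tz=UTC]
print(aware.tz)  # 'UTC'

naive = pa.timestamp("us")
print(naive.tz)  # None -- nothing to preserve for naive timestamps
```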
### What did you expect to happen?
Timezone-aware Arrow timestamps should map to TIMESTAMP_TZ:
```
Arrow: timestamp[us, tz=UTC] → Snowflake: TIMESTAMP_TZ
```
This preserves the explicit timezone information from the source data.
### Minimal Reproducible Example
```python
import polars as pol
from datetime import datetime, timezone
import adbc_driver_snowflake.dbapi

# Create DataFrame with explicit UTC timestamps
df = pol.DataFrame({
    "id": [1, 2, 3],
    "timestamp": [
        datetime(2024, 12, 3, 10, 0, 0, tzinfo=timezone.utc),
        datetime(2024, 12, 3, 11, 0, 0, tzinfo=timezone.utc),
        datetime(2024, 12, 3, 12, 0, 0, tzinfo=timezone.utc),
    ]
}).with_columns(
    pol.col("timestamp").dt.replace_time_zone("UTC").dt.cast_time_unit("us")
)

# Verify Arrow schema has timezone info
print(df.to_arrow().schema)
# timestamp: timestamp[us, tz=UTC]

# Push to Snowflake using ADBC
conn = adbc_driver_snowflake.dbapi.connect("<snowflake_uri>")
df.write_database(
    table_name="test_table",
    connection=conn,
    engine="adbc",
    if_table_exists="replace"
)
conn.close()
```

Actual result in Snowflake:
```sql
SHOW COLUMNS IN test_table;
-- timestamp column type: TIMESTAMP_LTZ(6)
```

Expected result:

```sql
-- timestamp column type: TIMESTAMP_TZ(6)
```
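For completeness, the same check can be scripted over ADBC; a minimal sketch, assuming `test_table` was just created by the example above:

```python
import adbc_driver_snowflake.dbapi

# Re-open a connection and inspect the created column types.
conn = adbc_driver_snowflake.dbapi.connect("<snowflake_uri>")
cur = conn.cursor()
cur.execute("SHOW COLUMNS IN TABLE test_table")
for row in cur.fetchall():
    print(row)  # the data_type field currently reports TIMESTAMP_LTZ
cur.close()
conn.close()
```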
### Real-World Impact
This causes critical issues in multi-source data pipelines:
**Scenario:** Merging flight trajectory data from two sources (ACARS and SWIM)
- Source A: Data manually created with a `TIMESTAMP_TZ` schema → timestamps always displayed as UTC
- Source B: Data ingested via ADBC → `TIMESTAMP_LTZ` created → timestamps displayed in the session timezone
Result: When users in non-UTC timezones query and merge these datasets, the timestamps appear misaligned:
```sql
-- User's Snowflake session timezone: America/Los_Angeles (UTC-8)

-- Query Source A (TIMESTAMP_TZ)
SELECT timestamp FROM source_a WHERE id = 1;
-- Returns: 2024-12-03 10:00:00 +0000 (explicit UTC)

-- Query Source B (TIMESTAMP_LTZ)
SELECT timestamp FROM source_b WHERE id = 1;
-- Returns: 2024-12-03 02:00:00 -0800 (converted to session TZ)

-- Merge the sources
SELECT a.timestamp AS ts_a, b.timestamp AS ts_b
FROM source_a a
JOIN source_b b ON a.id = b.id;
-- ts_a: 2024-12-03 10:00:00 +0000
-- ts_b: 2024-12-03 02:00:00 -0800
-- These appear 8 hours apart even though they represent the same instant!
```

This breaks temporal joins, time-series analysis, and data quality validation.
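Pinning the session timezone to UTC is a partial mitigation, but it only fixes display for that session; the column is still `TIMESTAMP_LTZ`, so every other session is affected again. A sketch:

```python
import adbc_driver_snowflake.dbapi

conn = adbc_driver_snowflake.dbapi.connect("<snowflake_uri>")
cur = conn.cursor()
# Render TIMESTAMP_LTZ values in UTC for this session only;
# the underlying type information is still lost.
cur.execute("ALTER SESSION SET TIMEZONE = 'UTC'")
```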
### The Semantic Issue
When Arrow explicitly provides timezone metadata, the driver should preserve that information:
| Arrow Type | Current Mapping | Correct Mapping | Reasoning |
|---|---|---|---|
| `timestamp[us]` (no tz) | `TIMESTAMP_LTZ` | `TIMESTAMP_NTZ` or `TIMESTAMP_LTZ` | Ambiguous; either makes sense |
| `timestamp[us, tz=UTC]` | `TIMESTAMP_LTZ` ❌ | `TIMESTAMP_TZ` ✅ | Timezone is explicitly specified |
The key difference:
- `TIMESTAMP_LTZ`: "This is a UTC instant; interpret it in the user's local timezone"
- `TIMESTAMP_TZ`: "This is a UTC instant, and the timezone (UTC) is part of the data"
When Arrow says `tz=UTC`, that timezone should be stored in Snowflake.
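The difference is observable directly in Snowflake; a sketch against a hypothetical temp table (ADBC is used here only for consistency with the rest of this report):

```python
import adbc_driver_snowflake.dbapi

conn = adbc_driver_snowflake.dbapi.connect("<snowflake_uri>")
cur = conn.cursor()
cur.execute(
    "CREATE OR REPLACE TEMP TABLE tz_demo (ltz TIMESTAMP_LTZ, tz TIMESTAMP_TZ)"
)
# Store the same instant in both columns.
cur.execute(
    "INSERT INTO tz_demo SELECT '2024-12-03 10:00:00 +0000', "
    "'2024-12-03 10:00:00 +0000'"
)
cur.execute("ALTER SESSION SET TIMEZONE = 'America/Los_Angeles'")
# ltz is reinterpreted in the session zone; tz keeps its stored offset.
cur.execute("SELECT ltz, tz FROM tz_demo")
print(cur.fetchall())
cur.close()
```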
### Current Workaround
Pre-create tables with explicit schema before ingestion:
```python
cursor = conn.cursor()
cursor.execute("""
    CREATE TABLE test_table (
        id NUMBER(38, 0),
        timestamp TIMESTAMP_TZ
    )
""")
cursor.close()

# Then append data
df.write_database(
    table_name="test_table",
    connection=conn,
    engine="adbc",
    if_table_exists="append"
)
```

This works but requires boilerplate schema management and defeats the purpose of schema inference from Arrow types.
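The boilerplate can at least be factored out by deriving the DDL from the Arrow schema. A hypothetical sketch (`ddl_from_arrow` is not a library function, and only the types used in this report are handled):

```python
import pyarrow as pa

def ddl_from_arrow(name: str, schema: pa.Schema) -> str:
    # Hypothetical helper: build CREATE TABLE DDL from an Arrow schema,
    # mapping timezone-aware timestamps to TIMESTAMP_TZ.
    def col_type(t: pa.DataType) -> str:
        if pa.types.is_integer(t):
            return "NUMBER(38, 0)"
        if pa.types.is_timestamp(t):
            return "TIMESTAMP_TZ" if t.tz is not None else "TIMESTAMP_LTZ"
        raise NotImplementedError(f"unhandled Arrow type: {t}")
    cols = ", ".join(f"{f.name} {col_type(f.type)}" for f in schema)
    return f"CREATE TABLE {name} ({cols})"

# ddl_from_arrow("test_table", df.to_arrow().schema)
# -> 'CREATE TABLE test_table (id NUMBER(38, 0), timestamp TIMESTAMP_TZ)'
```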
### Proposed Solution

**Option 1 (preferred): Change the default mapping for timezone-aware timestamps**
```
timestamp[us, tz=<any>] → TIMESTAMP_TZ
timestamp[us] (no tz)   → TIMESTAMP_LTZ  (current behavior, unchanged)
```
**Option 2: Add a connection/statement option**
```python
# Allow users to specify timestamp type preference
conn = adbc_driver_snowflake.dbapi.connect(
    uri="<snowflake_uri>",
    db_kwargs={"adbc.snowflake.ingest.timestamp_tz_for_aware": "true"}
)
```

**Option 3: Follow Arrow metadata conventions**
If Arrow timestamp has timezone metadata, use TIMESTAMP_TZ. This aligns with Arrow's philosophy of preserving metadata.
### Environment

- ADBC Driver: `adbc-driver-snowflake` (latest from PyPI)
- Arrow: `pyarrow` (via Polars)
- Language: Python 3.11+
- Framework: Polars DataFrames
- Snowflake: Standard configuration
### Additional Context
The Snowflake ADBC driver documentation states:
> `timestamp_ltz`, `timestamp_ntz`, `timestamp_tz`: `timestamp[ns]`
> Local time zone will be used, except for `timestamp_ntz` which is not an instant.
This describes the read path (Snowflake → Arrow) but doesn't specify the write path (Arrow → Snowflake) behavior for timezone-aware timestamps. The write path mapping should be symmetric and preserve timezone information when it exists in the source.
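One way to observe the read path end to end; a hedged sketch assuming a hypothetical table with one column of each flavor:

```python
import adbc_driver_snowflake.dbapi

conn = adbc_driver_snowflake.dbapi.connect("<snowflake_uri>")
cur = conn.cursor()
cur.execute("SELECT ts_ltz, ts_ntz, ts_tz FROM ts_flavors")
# Inspect which Arrow types each Snowflake flavor round-trips to.
print(cur.fetch_arrow_table().schema)
cur.close()
```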
### Acknowledgment
I understand #2843 concluded that explicit schema creation is the recommended approach. However, I believe timezone-aware Arrow data represents a distinct case where the "correct" mapping is unambiguous: explicit timezone in Arrow should map to explicit timezone in Snowflake (TIMESTAMP_TZ), not session-dependent interpretation (TIMESTAMP_LTZ).
Even if the default behavior cannot change for backward compatibility, an opt-in option would greatly improve the developer experience for timezone-aware data pipelines.