Skip to content

SNOW-219884: Pandas datetime with timezone converts to timestamp_ntz in snowflake #199

@kychanbp

Description

@kychanbp

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using (python --version)?
    Python 3.8.2

  2. What operating system and processor architecture are you using (python -c 'import platform; print(platform.platform())')?
    macOS-10.15.7-x86_64-i386-64bit

  3. What are the component versions in the environment (pip list)?
    Package Version


asn1crypto 1.4.0
awswrangler 1.10.0
azure-common 1.1.25
azure-core 1.8.2
azure-storage-blob 12.5.0
boto3 1.15.18
botocore 1.18.18
certifi 2020.6.20
cffi 1.14.3
chardet 3.0.4
cryptography 3.2.1
idna 2.10
isodate 0.6.0
jmespath 0.10.0
msrest 0.6.19
numpy 1.19.4
oauthlib 3.1.0
oscrypto 1.2.1
packaging 20.4
pandas 1.1.4
pip 20.2.1
psycopg2-binary 2.8.6
pyarrow 2.0.0
pycparser 2.20
pycryptodomex 3.9.9
PyJWT 1.7.1
PyMySQL 0.10.1
pyOpenSSL 19.1.0
pyparsing 2.4.7
python-dateutil 2.8.1
pytz 2020.4
requests 2.23.0
requests-oauthlib 1.3.0
s3transfer 0.3.3
setuptools 50.2.0
six 1.15.0
snowflake-connector-python 2.3.5
snowflake-sqlalchemy 1.2.4
SQLAlchemy 1.3.20
sqlalchemy-redshift 0.8.1
urllib3 1.25.11
wheel 0.35.1

  1. What did you do?
    If possible, provide a recipe for reproducing the error.
    A complete runnable program is good.

I extract data from MySQL and constructed a pandas data frame. For example,

                     transaction_date                       created_on                      modified_on
0            2017-10-04 08:00:00+00:00        2018-10-05 18:56:39+08:00        2018-10-05 18:56:39+08:00
1            2017-10-04 08:00:00+00:00        2018-10-05 18:56:39+08:00        2018-10-05 18:56:39+08:00
2            2017-10-04 08:00:00+00:00        2018-10-05 18:56:39+08:00        2018-10-05 18:56:39+08:00
3            2017-10-04 08:00:00+00:00        2018-10-05 18:56:39+08:00        2018-10-05 18:56:39+08:00
4            2017-10-04 08:00:00+00:00        2018-10-05 18:56:39+08:00        2018-10-05 18:56:39+08:00
...                                ...                              ...                              ...
33934        2020-11-05 00:00:00+00:00 2020-11-06 18:45:03.681785+08:00 2020-11-06 18:45:03.711735+08:00
33935        2020-11-05 00:00:00+00:00 2020-11-06 18:45:03.685278+08:00 2020-11-06 18:45:18.441977+08:00
33936        2020-11-05 00:00:00+00:00 2020-11-06 18:45:03.688080+08:00 2020-11-06 18:45:21.579158+08:00
33937 2020-11-09 10:00:13.190445+00:00 2020-11-09 10:00:13.195200+08:00 2020-11-09 10:00:13.195237+08:00
33938        2020-11-09 10:28:57+00:00 2020-11-09 10:30:02.515862+08:00 2020-11-09 10:30:02.515884+08:00

the column types are:

transaction_date               datetime64[ns, UTC]
created_on          datetime64[ns, Asia/Hong_Kong]
modified_on         datetime64[ns, Asia/Hong_Kong]

I used to following code to load data into snowflake:
data.to_sql(target_table, conn, if_exists='replace',index=False, method=pd_writer)

  1. What did you expect to see?
    TIMESTAMP_TZ columns

  2. What did you see instead?
    TIMESTAMP_NTZ(9)

  3. Can you set logging to DEBUG and collect the logs?

import logging
import os

for logger_name in ['snowflake.sqlalchemy', 'snowflake.connector', 'botocore']: 
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)

Metadata

Metadata

Labels

bugSomething isn't workingstatus-triage_doneInitial triage done, will be further handled by the driver team

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions