go/adbc/driver/flightsql: Default Value (10 MB) For adbc.snowflake.rpc.ingest_target_file_size Not Used In 1.1.0 #1997

@Zan-L

What happened?

Jobs calling adbc_ingest() failed with a memory error. On inspection, the data had been split into one Parquet file per processor core, instead of files of ~10 MB each as in 1.0.0.
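One thing that may be worth trying while the default is not applied is to set the target file size explicitly on the statement before ingesting. The sketch below is untested against Snowflake and is not confirmed by this report as a fix. The option key is copied from the issue title and should be checked against the driver documentation, the helper name `pin_target_file_size` is hypothetical, and 10 MB is assumed to mean 10 * 1024 * 1024 bytes.

```python
# Option key as written in the issue title (assumption: confirm it against
# the Snowflake driver documentation before relying on it).
INGEST_TARGET_FILE_SIZE_KEY = "adbc.snowflake.rpc.ingest_target_file_size"

# Assumption: the documented 10 MB default means 10 * 1024 * 1024 bytes.
DEFAULT_TARGET_FILE_SIZE = 10 * 1024 * 1024


def pin_target_file_size(cursor, size_bytes=DEFAULT_TARGET_FILE_SIZE):
    """Set the ingest target file size on the cursor's underlying statement.

    `cursor` is expected to be an adbc_driver_manager.dbapi.Cursor, whose
    `adbc_statement` property exposes `set_options(**kwargs)`.
    """
    cursor.adbc_statement.set_options(
        **{INGEST_TARGET_FILE_SIZE_KEY: str(size_bytes)}
    )
    return cursor


# Intended usage (requires a live connection, so it is left as a comment):
#
#   with conn.cursor() as cursor:
#       pin_target_file_size(cursor)
#       cursor.adbc_ingest(table, data, mode)
```

If the explicit setting is honored, the upload should again produce many small files rather than one file per core.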

Stack Trace

adbc_driver_manager.InternalError: INTERNAL: unknown error type: cannot allocate memory
    cursor.adbc_ingest(table, data, mode)
  File "/usr/local/lib/python3.12/site-packages/adbc_driver_manager/dbapi.py", line 937, in adbc_ingest
    return _blocking_call(self._stmt.execute_update, (), {}, self._stmt.cancel)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "adbc_driver_manager/_lib.pyx", line 1569, in adbc_driver_manager._lib._blocking_call_impl
  File "adbc_driver_manager/_lib.pyx", line 1562, in adbc_driver_manager._lib._blocking_call_impl
  File "adbc_driver_manager/_lib.pyx", line 1295, in adbc_driver_manager._lib.AdbcStatement.execute_update
  File "adbc_driver_manager/_lib.pyx", line 260, in adbc_driver_manager._lib.check_error

How can we reproduce the bug?

Unfortunately, I cannot share the data. However, the behavior should be reproducible on a four-core VM: a moderately sized dataset (about 500 MB as Parquet) is split into four ~125 MB files when adbc_ingest() uploads it to Snowflake, instead of fifty ~10 MB files.
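The file counts above follow from simple arithmetic, sketched here for clarity. The function names are hypothetical and the model is deliberately simplified: it assumes the data divides evenly and that "one file per core" is an accurate description of the 1.1.0 behavior, which is what was observed but has not been confirmed in the driver source.

```python
import math

MB = 1024 * 1024


def files_by_target_size(total_bytes, target_file_bytes=10 * MB):
    """Expected behavior: roll over to a new file every ~target_file_bytes."""
    return math.ceil(total_bytes / target_file_bytes)


def one_file_per_writer(total_bytes, n_writers):
    """Observed behavior: one file per writer (here, one per core)."""
    return n_writers, total_bytes / n_writers


total = 500 * MB

expected_files = files_by_target_size(total)
observed_files, observed_size = one_file_per_writer(total, n_writers=4)

print(f"expected: {expected_files} files of ~10 MB")
print(f"observed: {observed_files} files of ~{observed_size / MB:.0f} MB")
```

With a 500 MB dataset on four cores this gives 50 files under the expected behavior and 4 files of about 125 MB each under the observed one, so each writer has to buffer far more data at once.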

Environment/Setup

Packages:
adbc-driver-manager==1.1.0
adbc-driver-snowflake==1.1.0

Operating system: Windows/Linux

Package manager: pip


Labels

Type: bug
