adbc_ingest() is dropping rows in Snowflake #1847

@davlee1972

What happened?

I'm trying to load 98 million rows from a set of CSV files (covering a 5-year period), but only 95 to 96 million rows are getting inserted into Snowflake using adbc_ingest(). The missing rows are spread fairly randomly, at roughly 16k records per day.
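For anyone trying to quantify the gap, here is a minimal sketch of how the per-day shortfall could be computed. It assumes the per-day row counts have already been collected from the CSV files and from Snowflake into two plain dicts keyed by date string; `missing_by_day` is a hypothetical helper, not part of ADBC.

```python
def missing_by_day(source_counts, target_counts):
    """Return {day: rows_missing} for every day whose counts differ.

    source_counts: rows per day in the CSV input (assumed precomputed).
    target_counts: rows per day found in the Snowflake table (assumed precomputed).
    """
    gaps = {}
    for day, expected in source_counts.items():
        landed = target_counts.get(day, 0)  # a day absent from the target counts as 0 rows
        if landed != expected:
            gaps[day] = expected - landed
    return gaps
```

Days that loaded completely are left out of the result, so a result with an entry for most days would match the "random, every day" pattern described above.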

I tried passing both a pyarrow Table and record batches to adbc_ingest(). In both cases rows are being dropped.
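A minimal sketch of how a single ingest call could be checked for dropped rows, independent of the input type. It assumes `cursor` is a DBAPI cursor from the ADBC Snowflake driver, that the target table already exists (hence `mode="append"`), and that nothing else writes to the table during the check. `count_rows` and `ingest_and_verify` are hypothetical helper names, not part of ADBC.

```python
def count_rows(cursor, table_name):
    # table_name is interpolated directly, so it must be a trusted identifier.
    cursor.execute(f"SELECT COUNT(*) FROM {table_name}")
    return cursor.fetchone()[0]


def ingest_and_verify(cursor, table_name, data, expected_rows, mode="append"):
    """Ingest `data` and compare expected rows with what actually landed.

    expected_rows is supplied by the caller, e.g. the num_rows of a
    pyarrow Table, or a count taken while the batches were built.
    """
    before = count_rows(cursor, table_name)
    # adbc_ingest returns the row count reported by the driver
    # (assumed here to be an int; it may not reflect what actually landed).
    reported = cursor.adbc_ingest(table_name, data, mode=mode)
    after = count_rows(cursor, table_name)
    landed = after - before
    return {
        "expected": expected_rows,
        "reported": reported,
        "landed": landed,
        "missing": expected_rows - landed,
    }
```

A non-zero `missing` value on a single call would confirm that rows are lost during the ingest itself and not in the CSV reading step.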

Here's a screenshot of my notebook code:
[image]

The odd thing is that sometimes it inserts 95 million rows and other times 96 million. The total number of inserted rows matches what I see in the Snowflake logs if I add up all the rows created by the COPY INTO SQL commands.

It looks like we're not sending all the batches across the wire.

How can we reproduce the bug?

No response

Environment/Setup

Python 3.9.10 on Red Hat 8 Linux with ADBC drivers 0.10.0.

Labels: Type: bug (Something isn't working)
