Python: Check for max_chunksize > 0 in Table.to_batches() #39788

@kylebarron

Description

Describe the bug, including details regarding any error messages, version, and platform.

Table.to_batches should check that max_chunksize > 0 and raise an exception otherwise. Currently, passing max_chunksize=0 sends it into an infinite loop, which is hard to debug because Jupyter just hangs.

import pyarrow as pa
table = pa.table({'a': [1, 2, 3, 4]})
table.to_batches(max_chunksize=0)
# hangs forever

It also led to an out-of-memory (OOM) failure.
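
Until such a check exists in the library, a caller-side guard could look roughly like the sketch below. The helper names check_max_chunksize and safe_to_batches are hypothetical and not part of pyarrow; the only pyarrow call assumed is Table.to_batches(max_chunksize=...), and the choice of ValueError is an assumption about what a fix might raise.

```python
def check_max_chunksize(max_chunksize):
    """Hypothetical guard: reject non-positive chunk sizes up front."""
    # None means "no limit" and is passed through unchanged.
    if max_chunksize is not None and max_chunksize <= 0:
        raise ValueError(
            f"max_chunksize must be > 0, got {max_chunksize!r}"
        )


def safe_to_batches(table, max_chunksize=None):
    """Validate max_chunksize, then delegate to table.to_batches().

    `table` is expected to be a pyarrow.Table, but any object with a
    compatible to_batches(max_chunksize=...) method works.
    """
    check_max_chunksize(max_chunksize)
    return table.to_batches(max_chunksize=max_chunksize)
```

With this wrapper, safe_to_batches(table, max_chunksize=0) fails immediately with a ValueError instead of hanging.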
Component(s)

Python
