llm-text column generation may lead to pyarrow backend errors #232

@nabinchha

Description

Priority Level

Medium (Annoying but has workaround)

Describe the bug

This is likely coming from here.

Depending on the workflow, the LLM sometimes responds with a value that can be deserialized. When that happens, the column can end up holding a mix of string and non-string values (for example lists), which leads to this error:

Unable to convert the dataset to a PyArrow backend
  |-- Conversion Error Message: ("Could not convert 'Not 'answer'able' with type str: tried to convert to double", 'Conversion failed for column 'answer' with type object')
  |-- This is often due to at least one column having mixed data types
  |-- Note: Reported data types will be inferred from the first non-null value of each column

Steps/Code to reproduce bug

Use an LLMTextColumnConfig to generate answers based on the question. Sometimes the answer comes back as a list instead of a plain string.
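
A minimal sketch of how the mixed column could arise, assuming some post-processing step tries to parse each response as JSON and falls back to the raw text. `maybe_deserialize` and the sample responses are hypothetical and are not taken from the library's code; the pyarrow call is optional and only runs if pyarrow is installed.

```python
import json


def maybe_deserialize(raw):
    """Hypothetical post-processing: parse JSON when possible, else keep the text."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


# Made-up LLM responses for illustration only.
raw_responses = ["3.5", '["Paris", "Lyon"]', "Not answerable"]
answers = [maybe_deserialize(r) for r in raw_responses]

# The column now mixes several Python types.
value_types = sorted({type(v).__name__ for v in answers})
print(value_types)

try:
    import pyarrow as pa
except ImportError:
    pa = None  # pyarrow is optional for this sketch

if pa is not None:
    try:
        pa.array(answers)  # type inference fails on mixed values
    except pa.ArrowException as exc:
        print(f"{type(exc).__name__}: {exc}")
```

The exact exception text depends on which value types appear first in the column, so it may not match the message quoted above word for word.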

Expected behavior

When using LLMTextColumnConfig, we shouldn't run into these issues. The responses should always be treated as strings.
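
As a possible workaround until this is fixed, the column values could be coerced back to strings before the PyArrow conversion. This is a sketch under the assumption that the generated values are JSON-compatible; `as_text` is a hypothetical helper, not part of the library.

```python
import json


def as_text(value):
    """Return a string for any generated value, leaving None (nulls) untouched."""
    if value is None or isinstance(value, str):
        return value
    try:
        # Lists, dicts, and numbers are re-serialized to their JSON text.
        return json.dumps(value, ensure_ascii=False)
    except TypeError:
        # Fall back to str() for objects JSON cannot encode.
        return str(value)


# Made-up mixed column for illustration only.
answers = [3.5, ["Paris", "Lyon"], "Not answerable", None]
normalized = [as_text(v) for v in answers]
print(normalized)
```

With a pandas DataFrame, the same helper could be applied to the affected column (for example via `Series.map`) before the dataset is handed to PyArrow.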

Additional context

No response
