Auto Cast / Auto Detect Types #921

@norberttech

Right now, when loading data from formats like CSV or JSON, everything is converted into StringEntry because those formats do not have any kind of schema.
It would be very beneficial to create a mechanism that, when used, tries to guess the best possible type based on the data value and casts the entry to it.

For example:

  • "true" -> boolean
  • "1" -> int
  • "2023-03-04" -> DateTime
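
To make the mapping above concrete, here is a minimal sketch of the value-level guessing in plain PHP. It is not how Flow implements or would implement this: `guess_type` is a hypothetical helper name, the accepted literals (`true`/`false`, integers without leading zeros, strict `Y-m-d` dates) are assumptions, and it returns `DateTimeImmutable` rather than `DateTime` purely as an illustration choice. Values that match nothing are left as strings.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Flow): guess the most specific type
 * for a raw string value and return the value cast to that type.
 * Requires PHP 8.0+ for the union return type.
 */
function guess_type(string $value): bool|int|\DateTimeImmutable|string
{
    $trimmed = trim($value);

    // Booleans: only the literal words, so "1" and "0" stay integers.
    $lower = strtolower($trimmed);
    if ($lower === 'true') {
        return true;
    }
    if ($lower === 'false') {
        return false;
    }

    // Integers: FILTER_VALIDATE_INT rejects leading zeros and out-of-range
    // values, so strings like "007" are kept as strings.
    $int = filter_var($trimmed, FILTER_VALIDATE_INT);
    if ($int !== false) {
        return $int;
    }

    // Dates: assume a strict Y-m-d layout. The round-trip comparison
    // rejects rolled-over dates such as "2023-02-30".
    if (preg_match('/^\d{4}-\d{2}-\d{2}$/', $trimmed) === 1) {
        $date = \DateTimeImmutable::createFromFormat('!Y-m-d', $trimmed);
        if ($date !== false && $date->format('Y-m-d') === $trimmed) {
            return $date;
        }
    }

    // Nothing matched: keep the original string untouched.
    return $value;
}

// Usage with the examples from the list above:
$bool = guess_type('true');       // true (bool)
$int  = guess_type('1');          // 1 (int)
$date = guess_type('2023-03-04'); // DateTimeImmutable for 2023-03-04
```

The order of the checks matters: booleans are tested before integers and only by word, so numeric columns containing `0` and `1` are not silently turned into booleans.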

I see it as a custom transformer, something that can be used like this:

df()
   ->read(from_csv('...'))
   ->autoCast() 
   ->write(to_parquet('...'))
   ->run(); 

Status: Done
