Skip to content

Parquet - Boolean column dictionary encoding #695

@norberttech

Description

@norberttech

Currently boolean columns are automatically encoded with RLE_DICTIONARY, there is nothing wrong with that as parquet specification is not limiting that in any way but...

  • Apache spark is not supporting that
  • It brings very little value as booleans can anyway be RLE encoded

We should put a rule in the code that would use the dictionary encoding on any other column type except Booleans.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    Done

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions