[Format][Docs] Document IPC Compression #37756

@ivirshup

Describe the enhancement requested

As discussed with @jorisvandenbossche at the NumFOCUS summit (thanks for pointing me to Message.fbs!).

At the moment, the format docs (e.g. https://arrow.apache.org/docs/format/Columnar.html) do not document how compression works with the IPC format. It would be quite helpful if this were documented. There is some documentation of this in the Message.fbs file, but it should probably be part of the formal docs as well.

Some questions I have that I think prose documentation could answer:

  • Is it correct that each buffer (in the flattened tree) is compressed separately, but the exact same compression must be used in each case?
  • Can we randomly access a compressed buffer within a RecordBatch because Buffer.offset accounts for compression, i.e. since the flatbuffers metadata should give us the lengths of the compressed Buffers?
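
To make the two questions concrete, here is a minimal Python sketch of the layout they are asking about. It assumes one reading of the Message.fbs comments: each buffer is compressed independently and written with its uncompressed length in the first 8 bytes as a 64-bit little-endian signed integer, with -1 meaning the bytes that follow are not compressed. This is not the Arrow implementation. `zlib` stands in for the real codecs (LZ4_FRAME, ZSTD) only because it ships with the standard library, and `write_body`, `read_buffer`, `ALIGNMENT`, and the `(offset, length)` index are hypothetical names for illustration, not Arrow APIs.

```python
import struct
import zlib

ALIGNMENT = 8  # assumed padding boundary for this sketch


def write_body(buffers):
    """Compress each buffer on its own; return (body, [(offset, length), ...])."""
    body = bytearray()
    index = []
    for raw in buffers:
        compressed = zlib.compress(raw)  # stand-in codec, not LZ4_FRAME/ZSTD
        if len(compressed) < len(raw):
            # 8-byte little-endian uncompressed length, then the compressed bytes
            payload = struct.pack("<q", len(raw)) + compressed
        else:
            # -1 marks a buffer stored as-is because compression did not help
            payload = struct.pack("<q", -1) + raw
        index.append((len(body), len(payload)))
        body += payload
        body += b"\x00" * (-len(body) % ALIGNMENT)  # pad to the next boundary
    return bytes(body), index


def read_buffer(body, index, i):
    """Decode only buffer i, using its recorded offset and on-disk length."""
    offset, length = index[i]
    chunk = body[offset:offset + length]
    (uncompressed_len,) = struct.unpack_from("<q", chunk, 0)
    data = chunk[8:]
    if uncompressed_len == -1:
        return data
    out = zlib.decompress(data)
    if len(out) != uncompressed_len:
        raise ValueError("length prefix does not match decompressed size")
    return out


buffers = [b"\x00" * 64, b"abc", bytes(range(256)) * 4]
body, index = write_body(buffers)
third = read_buffer(body, index, 2)  # touches only the third buffer's bytes
```

Under these assumptions, the offsets and lengths recorded in the metadata describe the bytes as written (prefix plus compressed data), which is what would make it possible to decode one buffer without decompressing the others.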

Component(s)

Documentation, Format
