[Java][Protocol] Collection serialization protocol by homogenization info #927

@chaokunyang

Description

Is your feature request related to a problem? Please describe.

In most cases, all elements of a collection have the same type and are not null. We should encode this
homogeneity information once, in advance, in an elements header to avoid the cost of writing it for every element.

Describe the solution you'd like

Format:

length(positive varint) | collection header | elements header | elements data
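
As a rough illustration of this layout, the sketch below writes the part that precedes the elements data. It assumes a 7-bits-per-byte varint with a continuation bit, a one-byte elements header, and a collection header that is already serialized to bytes; `CollectionLayoutSketch`, `writePositiveVarInt` and `writePrefix` are hypothetical names, not existing API.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of: length | collection header | elements header. */
final class CollectionLayoutSketch {

  /** Writes a non-negative int, 7 bits per byte, high bit set on all but the last byte (assumed encoding). */
  static void writePositiveVarInt(ByteArrayOutputStream out, int value) {
    if (value < 0) {
      throw new IllegalArgumentException("length must be non-negative: " + value);
    }
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation bit
      value >>>= 7;
    }
    out.write(value); // last byte, continuation bit clear
  }

  /**
   * Writes everything that comes before the elements data.
   * collectionHeader is empty for list/set types without extra state;
   * elementsHeader is assumed to fit in one byte (four flag bits).
   */
  static byte[] writePrefix(int length, byte[] collectionHeader, int elementsHeader) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writePositiveVarInt(out, length);
    out.write(collectionHeader, 0, collectionHeader.length);
    out.write(elementsHeader);
    return out.toByteArray();
  }
}
```

With an empty collection header, a collection of up to 127 elements would need only two bytes before its elements data under these assumptions.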

collection header

  • For ArrayList/LinkedArrayList/HashSet/LinkedHashSet, this will be empty.
  • For TreeSet, this will be the Comparator.

elements header

In most cases, all elements of a collection have the same type and are not null. The elements header encodes this
homogeneity information once to avoid the cost of writing it for every element. Specifically, the elements header
encodes four kinds of information, each using one bit:

  • Whether to track element refs: the first bit, 0b1, of the header flags it.
  • Whether the collection has null elements: the second bit, 0b10, of the header flags it. If ref tracking is
    enabled for this element type, this flag is invalid.
  • Whether the element type is not the declared type: the 3rd bit, 0b100, of the header flags it.
  • Whether the elements have different types: the 4th bit, 0b1000, of the header flags it.

By default, all bits are unset, which means that no element tracks refs, all elements have the same type and are
not null, and the actual element type is the declared type of the custom class field.
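
A minimal sketch of how such a header could be computed in one pass over the collection. The class, constant and method names are hypothetical. It also assumes that the null bit is simply left unset when ref tracking is on, which is one reading of "this flag is invalid".

```java
import java.util.Collection;

/** Hypothetical sketch: derives the four header bits by scanning the elements once. */
final class ElementsHeaderSketch {
  static final int TRACKING_REF = 0b1;            // 1st bit: track element refs
  static final int HAS_NULL = 0b10;               // 2nd bit: collection has null elements
  static final int NOT_DECL_ELEMENT_TYPE = 0b100; // 3rd bit: element type is not the declared type
  static final int NOT_SAME_TYPE = 0b1000;        // 4th bit: elements have different types

  static int computeHeader(Collection<?> elements, Class<?> declaredType, boolean trackRef) {
    int header = trackRef ? TRACKING_REF : 0;
    Class<?> firstType = null;
    for (Object element : elements) {
      if (element == null) {
        // Assumption: with ref tracking on, nulls are covered by the ref flag,
        // so the null bit is left unset.
        if (!trackRef) {
          header |= HAS_NULL;
        }
        continue;
      }
      Class<?> type = element.getClass();
      if (type != declaredType) {
        header |= NOT_DECL_ELEMENT_TYPE;
      }
      if (firstType == null) {
        firstType = type;
      } else if (type != firstType) {
        header |= NOT_SAME_TYPE;
      }
    }
    return header;
  }
}
```

For a `List<String>` field holding only non-null strings with ref tracking off, this returns 0, which matches the default case described above.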

elements data

Based on the elements header, the serialization of elements data may skip ref flag/null flag/element class info.
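
To make the saving concrete, here is a hedged sketch for the simplest case, a list of strings. A per-element null flag byte is written only when the header's null bit is set. `ElementsDataSketch`, `writeStrings` and the flag byte values are hypothetical, and `DataOutputStream.writeUTF` stands in for whatever element serializer would actually be used.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

/** Hypothetical sketch: per-element flags are written only if the header asks for them. */
final class ElementsDataSketch {
  static final int HAS_NULL = 0b10;    // 2nd header bit, as described above
  static final byte NULL_FLAG = 0;     // assumed flag values, for illustration only
  static final byte NOT_NULL_FLAG = 1;

  /** The header must have been computed from the same list; a null element without HAS_NULL fails. */
  static byte[] writeStrings(List<String> elements, int header) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    boolean hasNull = (header & HAS_NULL) != 0;
    try {
      for (String element : elements) {
        if (hasNull) {
          // Null bit set: every element carries a one-byte null flag.
          out.writeByte(element == null ? NULL_FLAG : NOT_NULL_FLAG);
          if (element == null) {
            continue;
          }
        }
        // Null bit unset: the element payload is written directly, with no flag.
        out.writeUTF(element);
      }
      out.flush();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }
}
```

In this sketch, a list without nulls costs one byte less per element when the null bit is unset; the same idea applies to the ref flag and to element class info.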

Additional context

#925
