Closed
Labels
type/enhancement: The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages
Description
In Pulsar we are storing a lot of metadata in ZooKeeper using different formats:
- BookKeeper ledgers: Protobuf Text
- Managed Ledgers and cursors: Protobuf Text
- Broker and namespace bundles load reports: JSON
Using text formats has been good for quick debugging sessions without special tools but has drawbacks:
- Size of data stored in ZK can be significant when many topics (>1M) are active in a cluster. Protobuf text format is like JSON in that it repeats every field name in each record.
- Speed of serializing/deserializing (binary formats are generally faster to parse)
- Garbage generated (with binary format we could switch to the custom protobuf code generator to generate reusable objects)
- Backward compatibility. Text protobuf is not backward compatible: unlike the binary parser, it fails to parse unknown fields (and there's no way to change that). This makes it very difficult to change the format (typically we would do one release that can understand the new format but still writes the old one, then a following release that writes the new format). Backward compatibility is key to ensuring we can roll back a release if an issue is detected during deployment.
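To give a rough feel for the size argument, here is a minimal sketch that hand-encodes one small record both as protobuf-style text and as protobuf binary varint fields. The record, its field names and its field numbers (`ledgerId`, `entries`, `size`) are made up for illustration and are not Pulsar's actual schema; a real comparison would use the generated protobuf classes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_binary(fields) -> bytes:
    """Binary form: each field is a tag varint followed by a value varint."""
    out = bytearray()
    for number, _name, value in fields:
        out += encode_varint((number << 3) | 0)  # wire type 0 = varint
        out += encode_varint(value)
    return bytes(out)


def encode_text(fields) -> bytes:
    """Text form: the field name is spelled out for every value."""
    return "".join(f"{name}: {value}\n" for _number, name, value in fields).encode("utf-8")


# Hypothetical record: (field number, field name, value)
LEDGER_INFO = [(1, "ledgerId", 12345), (2, "entries", 50000), (3, "size", 1048576)]

text_size = len(encode_text(LEDGER_INFO))
binary_size = len(encode_binary(LEDGER_INFO))
print(f"text: {text_size} bytes, binary: {binary_size} bytes")
```

For this toy record the text form takes 45 bytes and the binary form 11. The actual ratio for managed ledger and cursor metadata depends on the real schema and values, so it would need to be measured.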
Of the 3 categories listed above, I don't think we should bother about load reports, because they're not where the bulk of metadata is.
My proposal would be:
- 1.17 release:
  - Add the code to read both formats
  - A config switch to enable writing binary format for ML and cursors data in ZK, with default to text format.
  - Add tools to dump the content of a ML for human consumption
- 1.18 release:
  - Make binary default
  - Remove config switch for text/binary
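The 1.17 step (read both formats, write one chosen by a switch) could look roughly like the sketch below. It uses a toy two-field schema and hand-rolled codecs instead of real protobuf classes. The names `FIELDS`, `read_any`, `write` and `binary_enabled` are hypothetical, and trying binary first and falling back to text is an assumption about how the formats could be told apart, not a description of the actual implementation.

```python
FIELDS = {1: "ledgerId", 2: "entries"}  # hypothetical schema: field number -> name


def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Return (value, next_position); raise ValueError on truncated input."""
    result, shift = 0, 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def write_binary(record: dict) -> bytes:
    out = bytearray()
    for number, name in FIELDS.items():
        out += encode_varint(number << 3) + encode_varint(record[name])
    return bytes(out)


def write_text(record: dict) -> bytes:
    return "".join(f"{name}: {record[name]}\n" for name in FIELDS.values()).encode("utf-8")


def read_binary(data: bytes) -> dict:
    record, pos = {}, 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        if tag & 0x7 != 0:
            raise ValueError("not a varint field")
        value, pos = decode_varint(data, pos)
        name = FIELDS.get(tag >> 3)
        if name is not None:  # unknown field numbers are silently skipped
            record[name] = value
    return record


def read_text(data: bytes) -> dict:
    record = {}
    for line in data.decode("utf-8").splitlines():
        name, _, value = line.partition(": ")
        if name not in FIELDS.values():
            raise ValueError(f"unknown field {name!r}")  # text parsing is strict
        record[name] = int(value)
    return record


def read_any(data: bytes) -> dict:
    """Reader for the transition release: accept either format."""
    try:
        return read_binary(data)
    except ValueError:
        return read_text(data)


def write(record: dict, binary_enabled: bool = False) -> bytes:
    """Writer gated by the config switch; text stays the default for now."""
    return write_binary(record) if binary_enabled else write_text(record)
```

With this shape, a rollback after enabling the switch stays safe as long as the previous release already ships `read_any`. The sketch also mirrors the compatibility point above: the binary reader skips a field number it does not know, while the text reader rejects an unknown name.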
Once the change has been implemented, it would be easy to verify the size difference in advance and, eventually, to consider storing BK ledgers in binary format as well.