Unibuf is a pure Ruby gem for parsing and manipulating multiple serialization formats including Protocol Buffers, FlatBuffers, and Cap’n Proto.
It provides fully object-oriented, specification-compliant parsers with rich domain models, comprehensive schema validation, binary format encoding/decoding, and complete round-trip serialization support.
Key features:
- Protocol Buffers
  - Parse text format (.txtpb, .textproto)
  - Parse binary format (.binpb) with schema
  - Serialize to binary format (.binpb)
  - Parse Proto3 schemas (.proto)
  - Wire format encoding/decoding (varint, zigzag, all wire types)
- FlatBuffers
  - Parse schemas (.fbs)
  - Parse binary format (.fb)
  - Serialize to binary format (.fb)
- Cap’n Proto
  - Parse schemas (.capnp)
  - Parse binary format with segment management
  - Serialize to binary format with pointer encoding
  - Support for structs, enums, interfaces (RPC)
  - Generic types (List<T>)
  - Unions and annotations
- Serialization and validation
  - Complete round-trip serialization for all formats
  - Schema-driven validation and deserialization
- Developer usage
  - Rich domain models with 60+ behavioral classes
  - Complete CLI toolkit for all formats
  - Pure Ruby - no C/C++ dependencies
Add this line to your application’s Gemfile:
gem "unibuf"
And then execute:
bundle install
Or install it yourself as:
gem install unibuf
Full support for Protocol Buffers (protobuf) including text format parsing, binary format parsing/serialization, and Proto3 schema parsing.
See PROTOBUF.adoc for detailed documentation.
require "unibuf"
# Load schema (recommended for validation)
schema = Unibuf.parse_schema("schema.proto") # (1)
# Parse text format file
message = Unibuf.parse_textproto_file("data.txtpb") # (2)
# Validate against schema
validator = Unibuf::Validators::SchemaValidator.new(schema) # (3)
validator.validate!(message, "MessageType") # (4)
(1) Load Proto3 schema from .proto file
(2) Parse Protocol Buffers text format
(3) Create validator with schema
(4) Validate message against schema
require "unibuf"
# 1. Load schema (REQUIRED for binary)
schema = Unibuf.parse_schema("schema.proto") # (1)
# 2. Parse binary Protocol Buffer file
message = Unibuf.parse_binary_file("data.binpb", schema: schema) # (2)
# 3. Access fields normally
puts message.find_field("name").value # (3)
(1) Schema is mandatory for binary parsing
(2) Parse binary file with schema
(3) Access fields like text format
Complete support for Google FlatBuffers including schema parsing (.fbs files)
and binary format parsing/serialization.
See FLATBUFFERS.adoc for detailed documentation.
require "unibuf"
# Parse FlatBuffers schema
schema = Unibuf.parse_flatbuffers_schema("schema.fbs") # (1)
# Access schema structure
table = schema.find_table("Monster") # (2)
table.fields.each { |f| puts "#{f.name}: #{f.type}" } # (3)
(1) Parse .fbs schema file
(2) Find table definition
(3) Iterate through fields
Complete support for Cap’n Proto including schema parsing (.capnp files) and
binary format parsing/serialization with segment management and pointer
encoding.
See CAPNPROTO.adoc for detailed documentation.
require "unibuf"
# Parse Cap'n Proto schema
schema = Unibuf.parse_capnproto_schema("addressbook.capnp") # (1)
# Access schema structure
person = schema.find_struct("Person") # (2)
person.fields.each { |f| puts "#{f.name} @#{f.ordinal} :#{f.type}" } # (3)
# Access interfaces (RPC)
calc = schema.find_interface("Calculator") # (4)
calc.methods.each { |m| puts "#{m.name} @#{m.ordinal}" } # (5)
(1) Parse .capnp schema file
(2) Find struct definition
(3) Iterate through fields with ordinals
(4) Find interface definition (RPC)
(5) List RPC methods
# Parse binary Cap'n Proto data
parser = Unibuf::Parsers::Capnproto::BinaryParser.new(schema) # (1)
data = parser.parse(binary_data, root_type: "Person") # (2)
# Access data
puts data[:name] # (3)
puts data[:email] # (4)
(1) Create parser with schema
(2) Parse binary with root type
(3) Access text field
(4) Access another field
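To make "segment management" concrete, here is a minimal sketch of reading the segment table that precedes an unpacked Cap’n Proto message stream: a little-endian 32-bit segment count (stored minus one), one 32-bit size in words per segment, and padding to an 8-byte boundary. The helper name `read_segment_table` is hypothetical and only illustrates the framing; it is not Unibuf's API, and Unibuf's SegmentReader may be organized differently.

```ruby
# Parse the segment table at the start of an unpacked Cap'n Proto stream.
# Returns [segment_sizes_in_words, header_size_in_bytes].
# Hypothetical helper for illustration; not part of Unibuf.
def read_segment_table(bytes)
  segment_count = bytes.unpack1("V") + 1 # stored on the wire as count - 1
  sizes = bytes.byteslice(4, 4 * segment_count).unpack("V*")
  header_size = 4 + 4 * segment_count
  header_size += 4 unless (header_size % 8).zero? # pad to a word boundary
  [sizes, header_size]
end

# A stream with a single segment of 2 words (16 bytes of content).
stream = [0, 2].pack("V2") + ("\x00" * 16)
sizes, header_size = read_segment_table(stream)
puts "segment sizes (words): #{sizes.inspect}, header bytes: #{header_size}"
```

Each segment's content then starts right after the header, with sizes counted in 8-byte words.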
# Serialize to binary
serializer = Unibuf::Serializers::Capnproto::BinarySerializer.new(schema) # (1)
binary = serializer.serialize(
  { id: 1, name: "Alice", email: "alice@example.com" }, # (2)
  root_type: "Person" # (3)
)
# Write to file
File.binwrite("output.capnp.bin", binary) # (4)
(1) Create serializer with schema
(2) Provide data as hash
(3) Specify root struct type
(4) Write binary output
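As an illustration of what "pointer encoding" involves, the sketch below packs and unpacks a Cap’n Proto struct pointer: one 64-bit little-endian word holding a 2-bit kind (0 for structs), a signed 30-bit offset in words, and two 16-bit section sizes. The helper names are hypothetical and chosen for this example; they are not Unibuf's PointerEncoder API.

```ruby
# Pack a Cap'n Proto struct pointer into its 8-byte little-endian form.
# Hypothetical helpers for illustration; not part of Unibuf.
def encode_struct_pointer(offset_words, data_words, pointer_words)
  lower = (offset_words & 0x3FFF_FFFF) << 2 # bits 0-1 stay 0: struct pointer
  [lower, data_words, pointer_words].pack("Vvv")
end

# Unpack a struct pointer back into its three components.
def decode_struct_pointer(bytes)
  lower, data_words, pointer_words = bytes.unpack("Vvv")
  raise ArgumentError, "not a struct pointer" unless (lower & 0x3).zero?

  offset = lower >> 2
  offset -= (1 << 30) if offset >= (1 << 29) # sign-extend the 30-bit offset
  { offset: offset, data_words: data_words, pointer_words: pointer_words }
end

# Root pointer for a struct with 1 data word and 2 pointers, placed
# immediately after the pointer itself (offset 0).
pointer = encode_struct_pointer(0, 1, 2)
p pointer.bytes
p decode_struct_pointer(pointer)
```

The offset is measured from the end of the pointer word to the start of the struct's data section, which is why a struct placed directly after its pointer has offset 0.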
Parse human-readable Protocol Buffer text format files following the official specification.
See TXTPROTO.adoc for detailed documentation.
require "unibuf"
# Load schema (recommended for validation)
schema = Unibuf.parse_schema("schema.proto") # (1)
# Parse text format file
message = Unibuf.parse_textproto_file("data.txtpb") # (2)
# Validate against schema
validator = Unibuf::Validators::SchemaValidator.new(schema) # (3)
validator.validate!(message, "MessageType") # (4)
(1) Load Proto3 schema from .proto file
(2) Parse Protocol Buffers text format
(3) Create validator with schema
(4) Validate message against schema
Parse binary Protocol Buffer data using wire format decoding with schema-driven deserialization.
The schema is REQUIRED for binary parsing because the binary format stores only field numbers and wire types, not field names or declared types.
require "unibuf"
# 1. Load schema (REQUIRED for binary)
schema = Unibuf.parse_schema("schema.proto") # (1)
# 2. Parse binary Protocol Buffer file
message = Unibuf.parse_binary_file("data.binpb", schema: schema) # (2)
# 3. Access fields normally
puts message.find_field("name").value # (3)
(1) Schema is mandatory for binary parsing
(2) Parse binary file with schema
(3) Access fields like text format
# Read binary data
binary_data = File.binread("data.binpb")
# Parse with schema
schema = Unibuf.parse_schema("schema.proto")
message = Unibuf.parse_binary(binary_data, schema: schema)
The binary parser supports all Protocol Buffer wire types:
- Varint (Type 0)
  Variable-length integers: int32, int64, uint32, uint64, sint32, sint64, bool, enum
- 64-bit (Type 1)
  Fixed 8-byte values: fixed64, sfixed64, double
- Length-delimited (Type 2)
  Variable-length data: string, bytes, embedded messages, packed repeated fields
- 32-bit (Type 5)
  Fixed 4-byte values: fixed32, sfixed32, float
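The wire type listed above travels in the low three bits of each field's tag, with the field number in the remaining bits. A minimal sketch of splitting a tag follows; `split_tag` and `WIRE_TYPE_NAMES` are hypothetical names for illustration, not Unibuf's API, and the deprecated group wire types (3 and 4) are left out.

```ruby
# Wire type numbers from the list above, mapped to readable names.
WIRE_TYPE_NAMES = {
  0 => :varint,
  1 => :fixed64,
  2 => :length_delimited,
  5 => :fixed32
}.freeze

# Split an already-decoded tag value into [field_number, wire_type_name].
# Hypothetical helper for illustration; not part of Unibuf.
def split_tag(tag)
  field_number = tag >> 3 # upper bits carry the field number
  wire_type = tag & 0x07  # low three bits carry the wire type
  [field_number, WIRE_TYPE_NAMES.fetch(wire_type, :unknown)]
end

p split_tag(0x08) # tag for field 1 with a varint payload
p split_tag(0x12) # tag for field 2 with a length-delimited payload
p split_tag(0x1D) # tag for field 3 with a 32-bit payload
```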
Unibuf implements complete Protocol Buffers wire format decoding according to the official specification.
- Varint decoding
  Efficiently decode variable-length integers used for most numeric types
- ZigZag encoding
  Proper handling of signed integers (sint32, sint64) with zigzag decoding
- Fixed-width types
  Decode 32-bit and 64-bit fixed-width values (fixed32, fixed64, float, double)
- Length-delimited
  Parse strings, bytes, and embedded messages with length prefixes
- Schema-driven
  Use schema to determine field types and deserialize correctly
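For readers unfamiliar with the first two items, the following is a small standalone sketch of base-128 varint decoding and ZigZag decoding as defined by the Protocol Buffers encoding. The method names `decode_varint` and `zigzag_decode` are assumptions made for this example; Unibuf's internal decoder may be structured differently.

```ruby
# Decode one base-128 varint starting at +offset+ in a binary String.
# Returns [value, next_offset]. Hypothetical helper, not Unibuf's API.
def decode_varint(bytes, offset = 0)
  value = 0
  shift = 0
  loop do
    byte = bytes.getbyte(offset)
    raise ArgumentError, "truncated varint" if byte.nil?

    offset += 1
    value |= (byte & 0x7F) << shift                # low 7 bits carry data
    return [value, offset] if (byte & 0x80).zero?  # high bit clear: last byte

    shift += 7
    raise ArgumentError, "varint too long" if shift >= 70
  end
end

# Undo ZigZag encoding used by sint32/sint64: 0 => 0, 1 => -1, 2 => 1, 3 => -2.
def zigzag_decode(n)
  (n >> 1) ^ -(n & 1)
end

value, next_offset = decode_varint("\xAC\x02".b)
puts "value=#{value} next_offset=#{next_offset}" # value=300 next_offset=2
puts zigzag_decode(3)                            # -2
```

ZigZag keeps small negative numbers short on the wire; without it, a negative int32 or int64 always occupies ten varint bytes.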
# Schema defines the structure
schema = Unibuf.parse_schema("schema.proto")
# Binary data uses wire format encoding
binary_data = File.binread("data.binpb")
# Parser uses schema to decode wire format
message = Unibuf.parse_binary(binary_data, schema: schema)
# Access decoded fields
message.field_names # => ["name", "id", "enabled"]
message.find_field("id").value # => Properly decoded integer
Unibuf follows Protocol Buffers' and FlatBuffers' schema-driven architecture.
The schema (.proto or .fbs file) defines the message structure and is
REQUIRED for binary parsing and serialization.
This design ensures type safety and enables proper deserialization of binary formats.
The schema defines:
- Message/struct types and their fields
- Field types, numbers, and ordinals
- Field wire types for binary encoding
- Repeated and optional fields
- Nested message/struct structures
Binary Protocol Buffers, FlatBuffers, and Cap’n Proto cannot be parsed without a schema because the binary formats only store field identifiers, not field names or complete type information.
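For Protocol Buffers specifically, the point can be illustrated with two bytes: the same varint payload yields different values depending on the type the schema declares for the field. The snippet below is a standalone sketch; the `INTERPRETATIONS` table is a hypothetical stand-in for schema information and is not how Unibuf represents schemas.

```ruby
# Two bytes on the wire: tag 0x08 (field 1, varint) followed by the value 3.
tag, raw = "\x08\x03".b.unpack("CC")
field_number = tag >> 3

# Hypothetical schema fragments: the same field declared three different ways.
INTERPRETATIONS = {
  "int32"  => ->(n) { n },
  "sint32" => ->(n) { (n >> 1) ^ -(n & 1) }, # ZigZag-decoded
  "bool"   => ->(n) { n != 0 }
}.freeze

INTERPRETATIONS.each do |declared_type, decode|
  puts "field #{field_number} as #{declared_type}: #{decode.call(raw).inspect}"
end
```

The bytes alone identify only "field 1, varint"; the name and the choice between 3, -2, and true come from the schema.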
# Load schema
schema = Unibuf.parse_schema("schema.proto") # (1)
# Parse message (text or binary)
message = Unibuf.parse_binary_file("data.binpb", schema: schema) # (2)
# Validate
validator = Unibuf::Validators::SchemaValidator.new(schema) # (3)
errors = validator.validate(message, "MessageType") # (4)
if errors.empty?
  puts "✓ Valid!" # (5)
else
  errors.each { |e| puts " - #{e}" } # (6)
end
(1) Parse the Proto3 schema
(2) Parse binary Protocol Buffer
(3) Create validator with schema
(4) Validate message
(5) Validation passed
(6) Show errors if any
Unibuf supports complete round-trip serialization for text format, allowing you to parse, modify, and serialize back while preserving semantic equivalence.
# Parse (text or binary)
message = Unibuf.parse_textproto_file("input.txtpb") # (1)
# Serialize to text format
textproto = message.to_textproto # (2)
File.write("output.txtpb", textproto) # (3)
# Verify round-trip
reparsed = Unibuf.parse_textproto(textproto) # (4)
puts message == reparsed # => true (5)
(1) Parse the original file
(2) Serialize to text format
(3) Write to file
(4) Parse the serialized output
(5) Verify semantic equivalence
Unibuf provides rich domain models with comprehensive behavior.
Over 60 classes provide extensive functionality following object-oriented principles.
# Parse message (text or binary)
schema = Unibuf.parse_schema("schema.proto")
message = Unibuf.parse_binary_file("data.binpb", schema: schema)
# Classification (MECE)
message.nested? # Has nested messages?
message.scalar_only? # Only scalar fields?
message.maps? # Contains maps?
message.repeated_fields? # Has repeated fields?
# Queries
message.find_field("name") # Find by name
message.find_fields("tags") # Find all with name
message.field_names # All field names
message.repeated_field_names # Repeated field names
# Traversal
message.traverse_depth_first { |field| ... }
message.traverse_breadth_first { |field| ... }
message.depth # Maximum nesting depth
# Validation
message.valid? # Check validity
message.validate! # Raise if invalid
message.validation_errors # Get error list
Complete CLI toolkit supporting both text and binary Protocol Buffer formats.
Schema is REQUIRED for proper message type identification.
# Parse text format
unibuf parse data.txtpb --schema schema.proto --format json
# Parse binary format
unibuf parse data.binpb --schema schema.proto --format json
# Auto-detect format
unibuf parse data.pb --schema schema.proto --format yaml
# Specify message type
unibuf parse data.binpb --schema schema.proto --message-type FamilyProto
# Validate text format
unibuf validate data.txtpb --schema schema.proto
# Validate binary format
unibuf validate data.binpb --schema schema.proto
# Specify message type
unibuf validate data.pb --schema schema.proto --message-type MessageType
# Binary to JSON
unibuf convert data.binpb --schema schema.proto --to json
# Binary to text
unibuf convert data.binpb --schema schema.proto --to txtpb
# Text to JSON
unibuf convert data.txtpb --schema schema.proto --to json
Unibuf
├── Parsers
│   ├── Textproto               Text format parser
│   │   ├── Grammar             Parslet grammar
│   │   ├── Processor           AST transformation
│   │   └── Parser              High-level API
│   ├── Proto3                  Schema parser
│   │   ├── Grammar             Proto3 grammar
│   │   ├── Processor           Schema builder
│   │   └── Parser              Schema API
│   ├── Binary                  Binary Protocol Buffers
│   │   └── WireFormatParser    Wire format decoder
│   ├── Flatbuffers             FlatBuffers parser
│   │   ├── Grammar             FBS grammar
│   │   ├── Processor           Schema builder
│   │   └── BinaryParser        Binary format
│   └── Capnproto               Cap'n Proto parser
│       ├── Grammar             Cap'n Proto grammar
│       ├── Processor           Schema builder
│       ├── SegmentReader       Segment management
│       ├── PointerDecoder      Pointer decoding
│       ├── StructReader        Struct reading
│       ├── ListReader          List reading
│       └── BinaryParser        Binary format
├── Serializers
│   ├── BinarySerializer        Protocol Buffers binary
│   ├── Flatbuffers             FlatBuffers binary
│   │   └── BinarySerializer
│   └── Capnproto               Cap'n Proto binary
│       ├── SegmentBuilder      Segment allocation
│       ├── PointerEncoder      Pointer encoding
│       ├── StructWriter        Struct writing
│       ├── ListWriter          List writing
│       └── BinarySerializer
├── Models
│   ├── Message                 Protocol Buffer message
│   ├── Field                   Message field
│   ├── Schema                  Proto3 schema
│   ├── MessageDefinition       Message type definition
│   ├── FieldDefinition         Field specification
│   ├── EnumDefinition          Enum type definition
│   ├── Flatbuffers             FlatBuffers models (6 classes)
│   ├── Capnproto               Cap'n Proto models (7 classes)
│   └── Values                  Value type hierarchy (5 classes)
├── Validators
│   ├── TypeValidator           Type and range validation
│   └── SchemaValidator         Schema-based validation
└── CLI
    └── Commands                parse, validate, convert, schema
Bug reports and pull requests are welcome at https://github.com/lutaml/unibuf.
Copyright Ribose Inc.
Licensed under the 3-clause BSD License.