bjdata

Binary JData for Python - Lightning-Fast Binary JSON

A high-performance Python encoder and decoder for Binary JData (BJData), the lightweight binary JSON format optimized for scientific and data-intensive applications.

⚡ Blazing Fast

Optimized C extension provides significant speed improvements over pure Python

🔬 Scientific Ready

Native support for NumPy arrays and N-dimensional data structures

📦 Compact Storage

Strongly-typed containers reduce file sizes dramatically

👁️ Quasi-Readable

Semi-human-readable format makes debugging easier than with other binary formats


Why BJData?

Extended Data Types

BJData extends JSON with strongly-typed binary data, supporting unsigned integers, half-precision floats, and N-dimensional arrays essential for scientific computing.

Feature          BJData      JSON        BSON        MessagePack
Human Readable   ✅ Quasi    ✅ Full
Binary Format
N-D Arrays       ✅ Native
Unsigned Types
Max File Size    Unlimited   Unlimited   4 GB        4 GB

Get Started

Installation

Install PyBJData via pip or your favorite package manager. The C extension is optional but highly recommended for performance.

# Standard installation
pip3 install bjdata
# User installation (no root required)
pip3 install bjdata --user
# Debian/Ubuntu 21.04+
sudo apt-get install python3-bjdata

💡 Performance Tip: Check if the C extension is enabled with bjdata.EXTENSION_ENABLED
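
As a minimal sketch of the check mentioned in the tip, the snippet below reads `bjdata.EXTENSION_ENABLED` and reports the result. The helper name `extension_status` is hypothetical, and the import guard is only there so the snippet also runs on a machine where bjdata has not been installed yet.

```python
import importlib


def extension_status():
    """Report whether bjdata's C extension is active (hypothetical helper)."""
    try:
        bj = importlib.import_module("bjdata")
    except ImportError:
        return "bjdata is not installed"
    # EXTENSION_ENABLED is the flag named in the performance tip.
    if getattr(bj, "EXTENSION_ENABLED", False):
        return "C extension enabled"
    return "pure-Python fallback"


print(extension_status())
```

If this reports the pure-Python fallback, encoding and decoding should still work, only more slowly.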

Quick Start

Basic Encoding & Decoding

PyBJData behaves like Python's built-in json module. Encode and decode complex data structures with simple function calls.

import bjdata as bj

# Create a complex Python object
data = {
    'name': 'experiment_001',
    'temperature': 23.5,
    'readings': [1, 2, 3, 4, 5],
    'metadata': {
        'date': '2025-12-20',
        'valid': True
    }
}

# Encode to BJData binary format
encoded = bj.dumpb(data)
print(f"Encoded size: {len(encoded)} bytes")

# Decode back to Python object
decoded = bj.loadb(encoded)
print(decoded)
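
To see how the encoded size compares with plain JSON for the same object, a sketch along the following lines may help. It uses only `dumpb` and `loadb` from the example above plus the standard `json` module; the helper name `compare_sizes` is hypothetical, and the function returns `None` when bjdata is not installed.

```python
import json

try:
    import bjdata as bj
except ImportError:  # keep the sketch runnable without bjdata
    bj = None

data = {
    'name': 'experiment_001',
    'temperature': 23.5,
    'readings': [1, 2, 3, 4, 5],
    'metadata': {'date': '2025-12-20', 'valid': True},
}


def compare_sizes(obj):
    """Return (bjdata_bytes, json_bytes, roundtrip_ok), or None without bjdata."""
    if bj is None:
        return None
    encoded = bj.dumpb(obj)
    as_json = json.dumps(obj).encode('utf-8')
    # A lossless round trip should give back an equal Python object.
    return len(encoded), len(as_json), bj.loadb(encoded) == obj


result = compare_sizes(data)
if result is not None:
    print(f"BJData: {result[0]} bytes, JSON: {result[1]} bytes, "
          f"round trip ok: {result[2]}")
```

For small, string-heavy objects like this one the two sizes tend to be close; the gap is expected to widen once numeric arrays dominate the payload.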

Scientific Computing

NumPy Integration

Native support for NumPy arrays with optimized N-dimensional array storage. Arrays are automatically reconstructed on decode.

import bjdata as bj
import numpy as np

# Create NumPy arrays
scalar = np.float32(3.14159)
array_1d = np.array([1, 2, 3, 4, 5], dtype=np.int32)
array_2d = np.random.rand(10, 20)
array_3d = np.zeros((5, 10, 15), dtype=np.uint8)

# Encode with optimized format
data = {
    'scalar': scalar,
    'vector': array_1d,
    'matrix': array_2d,
    'tensor': array_3d
}

encoded = bj.dumpb(data)

# Decode - automatically reconstructs NumPy arrays
decoded = bj.loadb(encoded)
print(type(decoded['matrix']))  # <class 'numpy.ndarray'>
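
A quick way to confirm that an array survives the round trip is to compare dtype, shape, and values after decoding. The sketch below assumes both NumPy and bjdata are installed and otherwise does nothing; `roundtrip_report` is a hypothetical helper, not part of the library.

```python
try:
    import numpy as np
    import bjdata as bj
except ImportError:  # either package missing: the sketch becomes a no-op
    np = bj = None


def roundtrip_report(arr):
    """Encode and decode one array, then describe what came back."""
    if bj is None:
        return None
    decoded = bj.loadb(bj.dumpb({'a': arr}))['a']
    return {
        'is_ndarray': isinstance(decoded, np.ndarray),
        'dtype': getattr(decoded, 'dtype', None),
        'shape': getattr(decoded, 'shape', None),
        'same_values': bool(np.array_equal(decoded, arr)),
    }


if np is not None:
    report = roundtrip_report(np.arange(24, dtype=np.uint16).reshape(2, 3, 4))
else:
    report = None
print(report)
```

Checking the dtype explicitly matters for scientific data, because a silent widening (for example uint16 to int64) would change memory use even if the values compare equal.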

Technical Details

BJData Specification

Based on BJData Spec V1 Draft 3, derived from UBJSON with extensions for scientific computing.

Type        Marker  Size (incl. marker)  Range
null        Z       1 byte               -
true/false  T/F     1 byte               -
int8        i       2 bytes              -128 to 127
uint8       U       2 bytes              0 to 255
int16       I       3 bytes              -32,768 to 32,767
uint16      u       3 bytes              0 to 65,535
int32       l       5 bytes              -2.1B to 2.1B
uint32      m       5 bytes              0 to 4.3B
int64       L       9 bytes              -9.2E18 to 9.2E18
uint64      M       9 bytes              0 to 1.8E19
float16     h       3 bytes              IEEE 754 half
float32     d       5 bytes              IEEE 754 single
float64     D       9 bytes              IEEE 754 double

⚠️ Endianness: BJData uses little-endian byte order by default (since Draft 2), differing from UBJSON's big-endian format.
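
The table and the endianness note can be illustrated with Python's standard `struct` module: each fixed-size scalar is one marker byte followed by a little-endian payload. This is a hand-rolled sketch for illustration only, not the library's encoder; the names `SCALARS`, `encode_scalar`, and `decode_scalar` are hypothetical.

```python
import struct

# type name -> (BJData marker, struct format code), taken from the table
SCALARS = {
    'int8': ('i', 'b'), 'uint8': ('U', 'B'),
    'int16': ('I', 'h'), 'uint16': ('u', 'H'),
    'int32': ('l', 'i'), 'uint32': ('m', 'I'),
    'int64': ('L', 'q'), 'uint64': ('M', 'Q'),
    'float16': ('h', 'e'), 'float32': ('d', 'f'), 'float64': ('D', 'd'),
}


def encode_scalar(type_name, value):
    """Marker byte followed by the little-endian payload."""
    marker, fmt = SCALARS[type_name]
    return marker.encode('ascii') + struct.pack('<' + fmt, value)


def decode_scalar(buf):
    """Inverse of encode_scalar: returns (type_name, value)."""
    marker = buf[:1].decode('ascii')
    for name, (m, fmt) in SCALARS.items():
        if m == marker:
            return name, struct.unpack('<' + fmt, buf[1:])[0]
    raise ValueError(f"unknown marker {marker!r}")


print(encode_scalar('int16', 258))                  # b'I\x02\x01'
print(len(encode_scalar('uint32', 1)))              # 5
print(decode_scalar(encode_scalar('float16', 1.5)))
```

The int16 value 258 is 0x0102, so the low byte 0x02 comes first after the marker, which is the little-endian order described above. A big-endian UBJSON encoder would emit the two payload bytes in the opposite order.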

Real World

Example Use Cases

Typical applications include medical imaging, scientific datasets, and high-performance data storage.

import bjdata as bj
import numpy as np

# Medical imaging data (random voxels stand in for a real MRI volume)
dicom_data = {
    'patient_id': 'P123456',
    'study_date': '2025-12-20',
    'modality': 'MRI',
    'image_data': np.random.randint(0, 4096,
                                    (512, 512, 128),
                                    dtype=np.uint16),
    'slice_thickness': 1.5,
    'pixel_spacing': [0.5, 0.5],
    'metadata': {
        'scanner': 'Siemens Skyra 3T',
        'sequence': 'T1-weighted MPRAGE'
    }
}

# Encode efficiently with optimized arrays
encoded = bj.dumpb(dicom_data, container_count=True)
print(f"Size: {len(encoded) / 1024 / 1024:.2f} MB")

# Save to file
with open('scan.bjd', 'wb') as f:
    bj.dump(dicom_data, f, container_count=True)
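
As a sanity check on the printed size, the raw voxel payload of the volume above can be worked out by hand. The sketch assumes the array is stored as uncompressed binary, in which case the encoded output should come out slightly larger than this figure because of the array header and the surrounding metadata.

```python
from math import prod

shape = (512, 512, 128)   # same shape as image_data above
bytes_per_voxel = 2       # uint16

raw_bytes = prod(shape) * bytes_per_voxel
raw_mb = raw_bytes / 1024 / 1024  # same divisor as the print call above

print(f"Raw voxel payload: {raw_bytes} bytes = {raw_mb:.2f} MB")
```

This gives 67,108,864 bytes, or 64.00 MB when divided by 1024 twice as in the example.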

Learn More

Resources & Documentation

Explore the full documentation, contribute on GitHub, or join the NeuroJSON community.

API Reference

# Core functions
bjdata.dumpb(obj, container_count=False, 
             sort_keys=False, no_float32=True)
bjdata.loadb(chars, no_bytes=False, 
             intern_object_keys=False)
bjdata.dump(obj, fp, ...)
bjdata.load(fp, ...)
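
One use of the `sort_keys` option listed above is producing byte-identical output for dictionaries that differ only in insertion order, which helps when hashing or diffing encoded files. The sketch below assumes `sort_keys=True` orders object keys before writing, as in Python's json module; the helper name is hypothetical and it returns `None` when bjdata is not installed.

```python
try:
    import bjdata as bj
except ImportError:  # keep the sketch runnable without bjdata
    bj = None


def same_bytes_regardless_of_order():
    """Encode two equal dicts built in different key order and compare."""
    if bj is None:
        return None
    first = {'b': 1, 'a': 2}
    second = {'a': 2, 'b': 1}
    return bj.dumpb(first, sort_keys=True) == bj.dumpb(second, sort_keys=True)


outcome = same_bytes_regardless_of_order()
print(outcome)
```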

Copyright © 2020-2025 Qianqian Fang • Supported by NIH Grant U24-NS124027
Licensed under Apache License 2.0
