@chaokunyang chaokunyang commented Dec 25, 2025

Why?

This PR adds field-level metadata configuration support for Python, completing the cross-language field metadata feature across all Fory language implementations (Java, Rust, Go, C++, and now Python).

Field metadata allows users to:

  • Use numeric tag IDs instead of field names for more compact serialization
  • Control nullable flags, reference tracking, and field ignoring per field
  • Enable schema evolution with stable field identifiers

What does this PR do?

1. New pyfory.field() API

Introduces a new pyfory.field() function for fine-grained control over serialization behavior per field:

from dataclasses import dataclass
from typing import Optional, List
import pyfory
from pyfory import Fory, int32

@dataclass
class User:
    id: int32 = pyfory.field(0)                          # Tag ID 0
    name: str = pyfory.field(1)                          # Tag ID 1
    email: Optional[str] = pyfory.field(2, nullable=True) # Tag ID 2, nullable
    friends: List["User"] = pyfory.field(3, ref=True, default_factory=list)  # Tag ID 3, ref tracking
    _cache: dict = pyfory.field(-1, ignore=True, default_factory=dict)  # Field name encoding, ignored

2. TAG_ID Encoding Support

Implements TAG_ID encoding in the xlang serialization protocol:

  • id >= 0: Uses the numeric tag ID (2-bit encoding type = 0b11)
  • id = -1: Uses the field name with meta string encoding
  • TAG_ID encoding is more compact than field name encoding
  • Tag IDs stay stable when a field is renamed
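
The id convention above can be sketched in plain Python. This is an illustration only, not the pyfory implementation: the helper names `uses_tag_id` and `check_unique_tag_ids` are hypothetical, and two rules are assumptions not stated in this PR, namely rejecting ids below -1 and requiring non-negative tag IDs to be unique within a class.

```python
from typing import Dict

NAME_ENCODED = -1  # sentinel from the PR description: use field-name encoding


def uses_tag_id(tag_id: int) -> bool:
    """Return True if the field would use TAG_ID encoding, False for field-name encoding."""
    if tag_id < NAME_ENCODED:
        # Assumption: ids below -1 have no meaning and are rejected.
        raise ValueError(f"invalid tag id: {tag_id}")
    return tag_id >= 0


def check_unique_tag_ids(field_ids: Dict[str, int]) -> None:
    """Raise ValueError if two fields share a non-negative tag ID (assumed rule)."""
    seen: Dict[int, str] = {}
    for name, tag_id in field_ids.items():
        if not uses_tag_id(tag_id):
            continue  # name-encoded fields carry no tag ID to collide
        if tag_id in seen:
            raise ValueError(
                f"tag id {tag_id} used by both {seen[tag_id]!r} and {name!r}"
            )
        seen[tag_id] = name
```

With the `User` class from section 1, `check_unique_tag_ids({"id": 0, "name": 1, "email": 2, "friends": 3, "_cache": -1})` would pass, since `_cache` falls back to its field name.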

3. Field Header Format

Field header (8 bits):
- 2 bits: encoding type (0b00 to 0b10 = field name encodings, 0b11 = TAG_ID)
- 4 bits: size/tag_id (0-14 inline, 15 = overflow)
- 1 bit: nullable flag
- 1 bit: ref tracking flag
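
A minimal sketch of packing and unpacking such a header byte follows. The bit positions are an assumption: the list above gives the field widths but not their placement, so this sketch puts the encoding type in the two highest bits, then the 4-bit size/tag_id, then the nullable bit, then the ref tracking bit. The function names are hypothetical, and how the remainder of an overflowing value is written after the header is not shown.

```python
TAG_ID_ENCODING = 0b11  # from the header description
OVERFLOW = 15           # 4-bit value signalling that the real value did not fit inline


def pack_field_header(encoding: int, size_or_tag: int,
                      nullable: bool, track_ref: bool) -> int:
    """Pack one header byte. Bit placement is assumed, not taken from the spec."""
    if not 0 <= encoding <= 0b11:
        raise ValueError("encoding must fit in 2 bits")
    if size_or_tag < 0:
        raise ValueError("size/tag_id must be non-negative")
    inline = min(size_or_tag, OVERFLOW)  # values of 15 or more are marked as overflow
    return (encoding << 6) | (inline << 2) | (int(nullable) << 1) | int(track_ref)


def unpack_field_header(header: int) -> dict:
    """Split a header byte back into its four parts (same assumed layout)."""
    return {
        "encoding": (header >> 6) & 0b11,
        "inline": (header >> 2) & 0b1111,
        "nullable": bool((header >> 1) & 1),
        "track_ref": bool(header & 1),
    }
```

Under this assumed layout, a nullable field with tag ID 3 and no ref tracking packs to `0b11001110`.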

4. Schema Evolution Support

When deserializing data with a different schema than the registered class:

  • TypeDef meta contains the sender's field information
  • TAG_ID is resolved back to actual field names using the receiver's class metadata
  • Enables forward/backward compatibility
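
The resolution step can be illustrated as follows. This is a sketch under stated assumptions, not the code in typedef.py: `resolve_sender_fields` is a hypothetical name, the sender's fields are modelled as a list of either an integer tag ID or a string field name, and fields unknown to the receiver are mapped to `None` (assumed here to mean "skip on read").

```python
from typing import Dict, List, Optional, Set, Union


def resolve_sender_fields(
    sender_fields: List[Union[int, str]],
    receiver_tags: Dict[int, str],
    receiver_names: Set[str],
) -> List[Optional[str]]:
    """Map each sender field identifier to a receiver field name, or None if unknown."""
    resolved: List[Optional[str]] = []
    for ident in sender_fields:
        if isinstance(ident, int):
            # TAG_ID: look up the receiver's field registered under this tag
            resolved.append(receiver_tags.get(ident))
        else:
            # Field-name encoding: matches only if the receiver has the same name
            resolved.append(ident if ident in receiver_names else None)
    return resolved
```

For example, if the receiver renamed the field with tag 1 from `name` to `full_name`, data written with tag 1 still resolves to `full_name`, whereas a name-encoded `name` field would no longer match.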

5. Files Changed

New files:

  • python/pyfory/field.py: pyfory.field() function and ForyFieldMeta dataclass

Modified files:

  • python/pyfory/struct.py: Updated DataClassSerializer to support field metadata
  • python/pyfory/meta/typedef.py: Added TAG_ID to field name resolution
  • python/pyfory/meta/typedef_encoder.py: TAG_ID encoding support
  • python/pyfory/meta/typedef_decoder.py: TAG_ID decoding support
  • python/pyfory/__init__.py: Export field function

Test files:

  • python/pyfory/tests/test_field_meta.py: Comprehensive tests for field metadata

Related issues

#3002

Does this PR introduce any user-facing change?

Yes, introduces the new pyfory.field() API for field-level metadata configuration.

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

The binary protocol changes (TAG_ID encoding) are already part of the xlang specification and implemented in other languages.

Benchmark

No performance regression expected. TAG_ID encoding is more compact than field name encoding.

@chaokunyang chaokunyang force-pushed the add_python_field_meta_support branch from 48c588f to 0cfac8c Compare December 25, 2025 09:22
@chaokunyang chaokunyang force-pushed the add_python_field_meta_support branch from 0cfac8c to 8134a43 Compare December 25, 2025 09:32
Add `from __future__ import annotations` to field.py and struct.py
to support Python 3.9+ type hints syntax (dict[str, ...], list[...])
on Python 3.8.
…orts

Only export `field` and `ForyFieldMeta` as public API. Internal functions
like `extract_field_meta`, `validate_field_metas`, and `FORY_FIELD_METADATA_KEY`
should be accessed via `pyfory.field` module directly if needed.
ForyFieldMeta is an internal implementation detail.
@chaokunyang chaokunyang merged commit 2c86a77 into apache:main Dec 25, 2025
56 checks passed
@chaokunyang chaokunyang mentioned this pull request Dec 25, 2025
17 tasks