Skip to content

Conversation

@gpauloski
Copy link
Collaborator

@gpauloski gpauloski commented Nov 1, 2024

Description

This is a refactor of the interfaces and protocols in proxystore.stream to support these two high-level features:

  • Implementing shims that have more granular access to stream items.
  • Supporting sending stream items directly in the event structure. I.e., not putting all objects into a Store which is useful for small objects or shims that want to do their own data management.

To achieve these goals, the following changes were made:

  • Publisher and Subscriber are now union types of EventPublisher | MessagePublisher and EventSubscriber | MessageSubscriber, respectively.
    • The StreamProducer and StreamConsumer are compatible with both types of shim protocols.
    • EventPublisher/EventSubscriber send/receive EventBatch objects directly so they can implement more granular control over transfer, serialization, etc. This will be useful for @ValHayot's mofka integration.
    • MessagePublisher/MessageSubscriber are equivalent to the old protocols; they send/receive already serialized byte-string messages. These are easier to implement, and represent the exisitng shims provided in proxystore.stream.shims.
  • There are two kinds of new object events: NewObjectEvent and NewObjectKeyEvent.
    • A NewObjectEvent contains the data and metadata directly in the event.
    • A NewObjectKeyEvent contains the metadata and the key corresponding to an object in a Store.
    • The type of event is based on if the StreamProducer was configured to separate metadata and data using a Store. Topics with an associated Store will produce NewObjectKeyEvent and topics without an associate Store (or default Store) will produce NewObjectEvent.

The following breaking changes were made:

  • The stores parameter of StoreProducer is now keyword-only.
  • Existing Publisher/Subscriber shims will need to update their method signatures (Publisher.send() is now MessagePublisher.send_message() and MessageSubscriber must implement next_message()).

Note that the proxystore.stream module is marked as experimental so breaking changes in patch versions is acceptable.

Fixes N/A

Type of Change

  • Breaking Change (fix or enhancement which changes existing semantics of the public interface)
  • Enhancement (new features or improvements to existing functionality)
  • Bug (fixes for a bug or issue)
  • Internal (refactoring, style changes, testing, optimizations)
  • Documentation update (changes to documentation or examples)
  • Package (dependencies, versions, package metadata)
  • Development (CI workflows, pre-commit, linters, templates)
  • Security (security related changes)

Testing

Expanded unit tests to cover new configurations.

Pull Request Checklist

Please confirm the PR meets the following requirements.

  • Tags added to PR (e.g., breaking, bug, enhancement, internal, documentation, package, development, security).
  • Code changes pass pre-commit (e.g., mypy, ruff, etc.).
  • Tests have been added to show the fix is effective or that the new feature works.
  • New and existing unit tests pass locally with the changes.
  • Docs have been updated and reviewed if relevant.

@gpauloski gpauloski added the breaking Backwards incompatible change to public interfaces label Nov 1, 2024
The StreamProducer can now work without a mapping of topics to Stores.
When a Store is not assigned to a topic, the stream object is sent
directly inside the event. Thus, there are now two new object events:
NewObjectEvent for when the object is included directly, and
NewObjectKeyEvent for when the object is located in a Store.

The StreamProducer/StreamConsumer now support both types of
Publisher/Subscriber pairs, event-based and message-based, depending on
what granularity the shim implementation wants access to.
The _EventInfo became a little confusing with union types after adding
the different kinds of new object events, so I removed that in the
StreamConsumer. This did require moving some metadata from the
EventBatch to the specific event types, but this probably makes more
sense with some metadata only applying to certain types anyways.
@gpauloski gpauloski added the enhancement New features or improvements to existing functionality label Nov 2, 2024
@gpauloski gpauloski marked this pull request as ready for review November 2, 2024 04:36
@gpauloski gpauloski merged commit 34544f5 into main Nov 2, 2024
@gpauloski gpauloski deleted the stream-refactor branch November 2, 2024 04:39
braceal added a commit to ramanathanlab/deepdrivewe that referenced this pull request Oct 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Backwards incompatible change to public interfaces enhancement New features or improvements to existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants