Session event log sketch by arminsabouri · Pull Request #675 · payjoin/rust-payjoin

arminsabouri · 2025-04-30T13:54:33Z

This PR presents a rough sketch of the session event log architecture that @nothingmuch and I have been working on. Its primary purpose is to provide visibility into the direction we're taking with this design. I’d recommend against nitpicks or formatting comments at this stage. Architectural feedback is the most valuable form of review.

Note: This PR is not intended to be merged. It will be closed, and a follow-up PR will be opened with a clean commit history. The current branch history has diverged beyond repair, so the plan is to migrate functional chunks of code into a new branch. At present, this branch contains a working implementation of the session event log with replay support for the v2 receiver. We're currently focused on adding UniFFI bindings.

DanGould · 2025-05-07T21:45:47Z

This change is epic.

From my read there are a few major architectural changes to how both sender and receiver persist events, which affects how both Sender and Receiver proceed through their operation. Here's my understanding of what changed:

Serialize all the things

Many, many structs become serializable so that they may be persisted.

Idiomatic, generic `Receiver<State, Persister>` typestate pattern

Receiver becomes generic over its state and session persister implementation. This lets each state share a common process_event function so that events can be replayed in a loop from persistence logs to recover the latest state.

`ReceiverState` enum introduced

Each ReceiverState enum variant encapsulates an instance of impl Receiver<State, P> for some concrete State, because rust can't pattern match on generic type parameters.

`ReceiverSessionEvent` enum introduced

Each ReceiverSessionEvent enum variant encapsulates serialized data necessary to make a state transition. The encapsulation allows that common ReceiverState::process_event(event: ReceiverSessionEvent, persister) trait function to apply the next state in a loop that reads from the logs.

`Token` concept removed

This change allows persistence to happen inside the state transitions without an explicit save/load that we were using before.

Instead, each session itself has a persister (PersistedSession as of writing) implementation itself with arbitrary load functionaly. This may address some of the confusion around how the reference implementation was written vs the needs of persisting session data as in @spacebear21's #689.

PersistableError introduced

Somehow, some errors get recorded to show what something failed by calling PersistedSession::record_error. This part seems the least fleshed out and I don't have much insight into design decisions yet. I'm a bit confused as to why an error wouldn't also be an Event. Is there a good reason for that or is this part still underdesigned?

Please correct any misunderstanding of my read so far

nothingmuch · 2025-05-08T21:48:29Z

ReceiverState enum introduced

Each ReceiverState enum variant encapsulates an instance of impl Receiver<State, P> for some concrete State, because rust can't pattern match on generic type parameters.

the reason is more fundamental, an operation like replay log can't have a static return type that is just one of the states, so we need an enum to represent the combination of all of these possibilities: given a persisted state in the database it might be in any one state so we need an enum to express the return type of replay ranging over all possible typestates (technically this could return a trait object that you then downcast using as_any<Receiver<ExpectedState>> but yuck).

ReceiverSessionEvent enum introduced

Each ReceiverSessionEvent enum variant encapsulates serialized data necessary to make a state transition. The encapsulation allows that common ReceiverState::process_event(event: ReceiverSessionEvent, persister) trait function to apply the next state in a loop that reads from the logs.

similar logic as before re deserialization, unless you know what event type is coming next we need an enum type. in principle since the state transitions are known and the state machine has a linear structure except for failures (i.e. no branching structure), but the entire rationale for immutable events in a log is that it's easier to extend in the future with more event types, which would introduce variants into event logs recorded by different versions.

in the future, the history replay query accumulator typestate thingy need to be aware of all of those different versions and the sequences of events they produce (which btw, we should record some kind of version number in the first event i guess?), or at the very least have a procedure for converting legacy logs into current version logs by introducing dummy events corresponding to a new schema. we could commit to having all of that be statically typed, with the event log needing to contain a statically typed sequence of events (each of a different type which corresponds to a specific transitions between pairs of typestate types), but i don't think that buys us anything in terms of safety/stability, using a non exhaustive enum for the different types of events seems easier even for just the history accumulator, the simplest version of that would just scan the log for an event of the right type and return Some(the value), or None if no event was found in the log.

Token concept removed

This change allows persistence to happen inside the state transitions without an explicit save/load that we were using before.
...

PersistableError introduced

Somehow, some errors get recorded to show what something failed by calling PersistedSession::record_error. This part seems the least fleshed out and I don't have much insight into design decisions yet. I'm a bit confused as to why an error wouldn't also be an Event. Is there a good reason for that or is this part still underdesigned?

Please correct any misunderstanding of my read so far

We've discussed retaining this concept, but with changes in order to preserve the lack of side effects in the payjoin crate, which is a useful property.

@0xBEEFCAF3 here's a pseudocode sketch based on our call:

/// a token is now a specialized Result type that gives back
/// the receiver that the method consumed, but only if unwrapped
/// by a persister.save()
type Token<E : Event, N : State, C : State, R : StateError> = Result<Accepted<E : Event , N>,Rejected<E, R, C>>;

/// The Ok branch always contains the Accepted tuple.
/// This contains 1 or more (maybe exactly one?) events
/// and the Receiver with the state transition applied to it
/// as private fields 
struct Accepted<E, S : State>(Vec<E>, Receiver<S>)

/// The Err branch always contains the Rejected triple.
/// This is the same as Accepted except it contains an error.
/// The receiver is kept in the current state.
struct Rejected<E, R: Error, S : State>(Vec<E>, R, Receiver<S>)


trait StateError : Error {
    is_terminal() -> bool,
}

trait State {
   is_terminal() -> bool
}

impl Receiver<WithContext> {
  ...
  pub process_res(
    self, // consume self, not &mut self as it currently is
    body: &[u8],
    context: ohttp::ClientResponse,
  ) -> Token<ReceiverEvents, UncheckedProposal, WithContext, ProcessResError> {
    // decapusulate etc as per inner_process_res
    //
    // any error condition that is fatal can record information
    // with events
    //
    // Err(Rejected(...)) with is_terminal() implicitly closes
    // Same for Ok(Accepted()) and the terminal Receiver typestate
    // close() should still be available, but renamed abort()?

    // in the happy path, we accumulate some events
    let events : Vec<ReceiverEvent> = ...
    ...

    // and finally, consume self and the event buffer into an Accept token
    Ok(Accept(events, self))
  }

When calling process_res, you would then need to give this specialized result type to persister.save<E, N, C, R>() and it could unwrap it for you, returning Result<Receiver<N>, (R, Receiver<C>)>. An enum would be necessary for the FFI friendly, non generic version of save().

edit to add: thinking about it more, i think the nested Error in Reject makes more sense as part of the last event.

Instead, each session itself has a persister (PersistedSession as of writing) implementation itself with arbitrary load functionaly. This may address some of the confusion around how the reference implementation was written vs the needs of persisting session data as in @spacebear21's #689.

Consensus in the call today was that it mostly does, in particular the app choosing the ID for each session allows it to be used as a foreign key in whatever schema it has, so any additional data can be associated with a particular session.

arminsabouri · 2025-05-12T13:58:46Z

last push includes a sender type state pattern similar to the receiver type state design.
The possible type states include:

pub enum SenderState<P> {
    Uninitialized(),
    WithReplyKey(Sender<SenderWithReplyKey, P>),
    GetContext(Sender<V2GetContext, P>),
    ProposalReceived(Sender<ProposalReceived, P>),
}

Application devs will still interact with the senderbuilder to construct a Sender with reply key however when replaying the logs internally we use a uninit sender and not the sender builder.
One notable change which I would like concept ack/nack feedback on is I coalesced the process_res and extract_req type states on the v2 sender. Meaning when the sender is in state WithReplyKey they can extract req but that does not move the sender to the next state. And it does not persist anything new b/c we didnt learn anything new. Sender does get a POST context but I believe that is deterministic. Only after process_res'ing does the sender persist the orignal proposal to dir, persist the session event and move onto. This mimicks the same pattern we have in receiver.

Additionally I added a type state ProposalReceived which is only attainable when we process_res the payjoin proposal. It marks the success state for the sender and wraps around the payjoin PSBT.

DanGould

One notable change which I would like concept ack/nack feedback on is I coalesced the process_res and extract_req type states on the v2 sender. Meaning when the sender is in state WithReplyKey they can extract req but that does not move the sender to the next state.

I.e. sender state was created with URI, original PSBT, and reply key.

And [Sender<SenderWithReplyKey, P>::extract_*] does not persist anything new b/c [it] didnt learn anything [worth persisting].

Although ohttp_relay is passed to Sender<...>::extract_v2, it doesn't need to be persisted. Got it. Assuming new persistence is the only reason to advance state, this function does not advance state. Makes sense.

[`Sender::extract_v2] does [return] a [V2PostContext] but I believe that is deterministic.

I don't believe it it is deterministic because encrypt_message_a calls hpke::setup_sender, where it calls Kem::encap, which depends on csprng. Similarly, OHTTP encapsulation is not deterministic otherwise 2 requests encapsulating the same body would be linkable.

Only after process_res'ing does the sender persist the orignal proposal to dir, persist the session event and move onto.

So the original PSBT is not persisted after the builder builds? Or the proposal PSBT is only persisted after it's received?

This mimicks the same pattern we have in receiver.

Is mimicking the pattern the rationale for the change? The rationale for state transition upon v0.23.0 Sender::extract_req was that only 1 request would ever need to be extracted, and there was indeed a new state that needed to wait for a response. The sender's request would not poll. However, if the request failed, or if there were no network access before process_res could be called, I could see wanting to be able to extract another request instead of failing the session or persisting that request for replay (which would break unlinkability between OHTTP requests)

Is there some other rationale relating to persistence being a requirement for state transition proposed here? I'm not sure I have enough information to cack/nack, since I'm not sure exactly what you mean by "I coalesced the process_res and extract_req type states" or what the rationale for doing so is.

DanGould · 2025-05-12T16:08:01Z

payjoin-cli/src/app/v2.rs

+                    match self.post_orginal_proposal(&context).await {
+                        Ok(sender) => session = SenderState::V2GetContext(sender),
+                        Err(_) => {
+                            let (req, v1_ctx) = context.extract_v1();
+                            let response = post_request(req).await?;
+                            let psbt = Arc::new(v1_ctx.process_response(
+                                &mut response.bytes().await?.to_vec().as_slice(),
+                            )?);
+                            self.process_pj_response((*psbt).clone())?;
+                            return Ok(());
+                        }


It seems odd to me that post_original_proposal calls post_request, as does the Err(_) match arm, which also loop. It seems like if this initial post fails the payjoin should fail by the error arm returning an error, not returning Ok.

arminsabouri · 2025-05-13T11:50:13Z

So the original PSBT is not persisted after the builder builds? Or the proposal PSBT is only persisted after it's received?

Both are persisted under different session events.

Is there some other rationale relating to persistence being a requirement for state transition proposed here?

yes two reasons above:

"And it does not persist anything new b/c we didnt learn anything new." Meanings, In we only perist events when the type state learns a new peice of information (session hpke keys), non-deterministic (same example as last) and there are no adverse side affects from calling the prev type state.
creating similar patterns between the sender & receiver.

I was under the impression that extract_req will always get you the same req & ohttp_ctx and there are no side affects to looping and calling extract_req. But this is not the case. So I will revert back to the original type state.

was that only 1 request would ever need to be extracted, and there was indeed a new state that needed to wait for a response. The sender's request would not poll. However, if the request failed, or if there were no network access before process_res could be called, I could see wanting to be able to extract another request instead of failing the session or persisting that request for replay (which would break unlinkability between OHTTP requests)

Ok this reason + randomness in the KEM is convincing enough to keep the existing type states. One problem is that POSTContext includes a foreign type which does not statisfy the trait bounds. We'll have to wrap it and if that doenst work make a upstream change.

arminsabouri · 2025-05-14T12:46:24Z

The latest push reintroduces storage tokens, now referred to as state transition objects. The key change is that instead of persisting session events directly within the type state during a transition function (e.g., process_res), the transition function returns a state transition object. This object is then evaluated by the persister, which handles persisting the associated session event and returning the next state.

This approach introduces some additional boilerplate for application developers but enforces a clean separation between state logic and I/O operations by moving persistence concerns outside the type state.

State transition objects are grouped into six categories. Naming is provisional and we can bike shed later.

// A transition that can be either fatal or transient or have no results.
/// If success it must have a next state and an event to save
pub type MaybeFatalTransitionWithNoResults<Event, NextState, CurrentState, Err> =
    MaybeFatalStateTransitionResultWithNoResults<Event, NextState, CurrentState, Err>;

/// A transition that can be either fatal or transient.
/// If success it must have a next state and an event to save
/// If Fatal error we must save the event and close the session
pub type MaybeFatalTransition<Event, NextState, Err> =
    MaybeFatalStateTransitionResult<Event, NextState, Err>;

/// A transition that can be transient.
/// If success it must have a next state and an event to save
pub type MaybeTransientTransition<Event, NextState, Err> =
    Result<AcceptNextState<Event, NextState>, RejectTransient<Err>>;

/// A transition that can be success or transient error
/// If success there are no events to save or next state
/// If transient error we can retry this state transition
pub type MaybeSuccessTransition<Err> = Result<AcceptCompleted, RejectTransient<Err>>;

/// A transition that is always a next state transition
pub type NextStateTransition<Event, NextState> = AcceptNextState<Event, NextState>;

/// A transition that can be success or bad init inputs
/// This is a special case where the state machine init inputs are bad and we can't proceed
/// The only thing we can do is reject the session. Since the session doesnt really exist at this point
/// there is no need to save events or close the session
pub type MaybeBadInitInputsTransition<Event, NextState, Err> =
    Result<AcceptNextState<Event, NextState>, RejectBadInitInputs<Err>>;

Each transition type captures a specific set of possible outcomes that a state transition function may return. These include:

The session is completed and closed.
The session progresses to the next state, and the event must be saved before the new type state is obtained.
A transient error occurs; the session is not closed and the state does not change. A retry is expected.
A fatal error occurs; the session is closed and the event must be persisted.
The state machine initialization fails due to invalid input. The session is rejected without saving any event or transitioning state.

From the application developer’s perspective, the only implementation requirement remains the save_event function on the persister. Each persister function is now statically typed to its associated transition object, abstracting the underlying details of how each transition is persisted.

A notable detail is that transition types which include fatal outcomes are not wrapped in Result. This is a deliberate choice to prevent misuse such as blindly propagating (?) or unwrapping transitions that require events to be saved.

The remaining work before this PR can be broken into smaller, reviewable units includes:

Refactoring all state transition functions to consume self.
Applying the same state transition abstraction model to the sender.
Resolving potential asymmetries between the sender and receiver state machines.

arminsabouri · 2025-05-15T15:35:09Z

One hazy idea: in general its a tough to know what persister method you need to call for what state transition type alias. Two solutions

Either all state transition methods are structs and the persister save methods are named accordingly. E.g create_session give you a MaybeBadInitInputs struct which wraps a Result type and the persister has a method save_maybe_bad_init_inputs
Or each state transition method returns a struct with save() method which consumes a ref persister which then give you your next type state

cc @nothingmuch

nothingmuch · 2025-05-15T15:52:02Z

Or each state transition method returns a struct with save() method which consumes a ref persister which then give you your next type state

i think i prefer the simplicity of the 2nd approach

supporting both async and sync persister traits like we discussed would be perhaps a bit messier, it might be possible to do this without duplication based on a type parameter or associated type of the persister, which for async will be Future<Result<...>>, and in the other just Result<...>, i.e. a higher kinded type but i don't yet know how to do that in rust brb

that said even without HKT the 2nd approach seems like it'd have the least amount of boilerplate, while still emphasizing strong typing where we want it (i.e. we can't get away with a trait on all transitions because the whole point of splitting these is to simplify the return types depending on the nature of the transition)

arminsabouri · 2025-05-16T03:24:04Z

The latest push implement approach two here

Much happier with the reduction in boiler plate. Example:

let proposal = proposal
            .check_broadcast_suitability(None, |tx| Ok(wallet.can_broadcast(tx)?))
            .save(persister);

…. test

arminsabouri · 2025-05-19T20:18:12Z

Rebased on master (as of this morning. I missed a couple commits).
Fixed the int. tests.
Split the persister session trait into two. A publicly accessible trait with methods to implement (save_event, load, and close) and an internal trait that extends the public one. The internal trait pub(crate)'s the methods that handle specific state transition objects.
Reduces api side boilerplace when returning state transition objects by proving util methods for expressing the specific state transition.
e.g

- return MaybeFatalStateTransitionResult::Err(
-     MaybeFatalRejection::Transient(RejectTransient(
-           Error::ReplyToSender(e),
-       )),
-     );
+ return ReplyableError::Implementation(_) => MaybeFatalTransition::transient(Error::ReplyToSender(e))

…ce api-side boilerplate

arminsabouri · 2025-05-21T15:33:28Z

Last push includes:

Unit testing for each state transition objects and its respective persister save method.
Further api-side boilerplate reduction.
Handling replay session event log related errors
Handling expired sessions when replaying the session

… object

arminsabouri · 2025-06-20T17:33:35Z

This PR has served its purpose. Most changes have been applied to master.
Related tracking item: #785

DanGould reviewed May 12, 2025

View reviewed changes

arminsabouri added 21 commits May 19, 2025 11:36

WIP

206bb6c

WIP - applying process responses to each type state

b2c702c

WIP - persist each relevant session event for recv

f81e9d1

WIP - replay receiver checkpoint

b8cd7ab

WIP - replay receiver in payjoin cli

18e36b7

WIP - remove unused failure states

e354716

WIP - clean up code and make more idomatic

7e70c1b

WIP - clean up unused traits

aa52a57

WIP - more todos

22c2a91

WIP - remove global receiver

a2b69a2

WIP - load should return box dyn iter

38b16c2

WIP - debugging ignore this commit

3b90090

WIP - idomatic receiver generic over its state

646812e

WIP - more session history methods

927b2fd

WIP - idomatic reciever clean up

ad56924

WIP - Convert sender to typestate pattern

a1de5c5

WIP - session history should be result of replaying logs

dd697df

WIP - remove unused state trait

bd3589c

WIP - close recv session if session is expired

67618d3

WIP - state transitions

b9265de

WIP - reduce recv process session boilerplate

a295c7e

arminsabouri added 2 commits May 19, 2025 15:23

HACK: placeholder change to make multiparty sender to build

a693868

WIP - use new persister semantics for saving state transitions in int…

3c964d1

…. test

arminsabouri force-pushed the session-event-log-sketch branch from bef5c6b to 3c964d1 Compare May 19, 2025 20:01

arminsabouri added 8 commits May 20, 2025 09:45

WIP - handle expired session during session replay

8e79253

WIP - move session mgmt and histroy to submod

d24edcd

WIP - fix sender/recv unit tests

5fbb1b5

WIP - remove unused persisted session event trait

9d95ec8

WIP - roundtrip ser/deser unit test for recv session events

cf82284

WIP - remove unwraps during session replays

a63687b

WIP - testframework and unit tests for state transitions

e72d89e

WIP - improve visibility of internal persister types and further redu…

8c768eb

…ce api-side boilerplate

arminsabouri added 3 commits May 21, 2025 14:24

WIP - unit tests for state transition util methods

c90c03d

WIP - finalize typestate of ns1r sender should use state transition obj

557904f

WIP - mutliparty receiver idomatic typestate refactor

4d4ec6c

arminsabouri mentioned this pull request May 22, 2025

Add SessionPersister trait #716

Merged

arminsabouri added 7 commits May 22, 2025 15:49

WIP - backport over changes from payjoin#716

fca1ffd

Backport changes from d70d64e

5aee7cb

WIP - expose proposed and og psbt off Sender Session histroy object

5c1ff2b

WIP - expose proposed and og psbt off receiver Session histroy object

362defb

WIP - remove duplicate recv logs

4b711b7

WIP - only contribute one utxo as receiver

df2404b

WIP - expose payjoin psbt with contributed inputs via session history…

14020ab

… object

arminsabouri force-pushed the session-event-log-sketch branch from a3da5ac to 14020ab Compare May 31, 2025 13:27

arminsabouri added 2 commits June 2, 2025 10:42

WIP - session replay and history unit tests for receiver

3f21039

WIP - receiver should save reply key

d6a2869

spacebear21 mentioned this pull request Jun 4, 2025

feat: add non-persisted Async Payjoin support bitcoindevkit/bdk-cli#200

Merged

7 tasks

arminsabouri closed this Jun 20, 2025

Conversation

arminsabouri commented Apr 30, 2025

Uh oh!

DanGould commented May 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Serialize all the things

Idiomatic, generic Receiver<State, Persister> typestate pattern

ReceiverState enum introduced

ReceiverSessionEvent enum introduced

Token concept removed

PersistableError introduced

Uh oh!

nothingmuch commented May 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ReceiverState enum introduced

ReceiverSessionEvent enum introduced

Token concept removed

PersistableError introduced

Uh oh!

arminsabouri commented May 12, 2025

Uh oh!

DanGould left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

DanGould May 12, 2025

Choose a reason for hiding this comment

Uh oh!

arminsabouri commented May 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arminsabouri commented May 14, 2025

Uh oh!

arminsabouri commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nothingmuch commented May 15, 2025

Uh oh!

arminsabouri commented May 16, 2025

Uh oh!

arminsabouri commented May 19, 2025

Uh oh!

arminsabouri commented May 21, 2025

Uh oh!

arminsabouri commented Jun 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

DanGould commented May 7, 2025 •

edited

Loading

Idiomatic, generic `Receiver<State, Persister>` typestate pattern

`ReceiverState` enum introduced

`ReceiverSessionEvent` enum introduced

`Token` concept removed

nothingmuch commented May 8, 2025 •

edited

Loading

`ReceiverState` enum introduced

`ReceiverSessionEvent` enum introduced

`Token` concept removed

DanGould left a comment •

edited

Loading

arminsabouri commented May 13, 2025 •

edited

Loading

arminsabouri commented May 15, 2025 •

edited

Loading