Skip to content

Rethinking stateful time APIs to avoid surprising behaviors #3743

@teh-cmc

Description

@teh-cmc

TL;DR: Introduce per-recordingstream clocks, as opposed to per-thread clocks.

Status quo

Logging to a recording is controlled by RecordingStreams.

RecordingStream are Send & Sync, i.e. they can be shallow-cloned (refcounted) and sent to other threads where they can then be used concurrently.
re_sdk provides facilities to help keep track of per-process and per-thread RecordingStreams.
Those facilities are mostly used by the Python bindings, though nothing prevents using them in Rust & C++ too.

Independently of recording streams, we also have per-thread clocks.
These clocks keep track of one time per-Timeline per-RecordingStreamId per-ThreadId: i.e. something along the lines of HashMap<(ThreadId, RecordingStreamId, Timeline), Time>.

These times can be set in a stateful manner using the RecordingStream::set_time_* family of methods.
When it's time to log something, a recording stream will query the clock in thread-local storage and ask for the time on all timelines for the current (ThreadId, RecordingStreamId) pair.

Problems

This implicit relationship between threads and RecordingStreams can lead to confusion and/or surprising behavior.

Example:

  1. You call rec.set_time_sequence("frame", 42) on your main thread.
  2. You send a bunch of data as well as a copy of rec to another thread where the data gets pre-processed then logged to rec.
  3. You get confused: the data doesn't have a "frame" timestamp associated with it (the thread-local clock on your post-processing thread doesn't know about it!).

You can imagine all kinds of scenarios like the above. The root cause is always the same though: thread-local clocks vs. recording streams vs. the interaction between those.

In Rust & C++, you generally always work directly with RecordingStream handles of some kind, so at least the confusion stops there.

But in Python, it can actually gets worse:

  • You generally use the global recording-stream, which is implicitly set when calling rr.init().
  • Although sometimes you may be using a thread-local recording-stream (via the with syntax).
  • And nothing prevents you to just use a recording stream handle, like you'd do in Rust/C++!

Finally, each RecordingStream keeps track of its internal, process-global tick: i.e. the log_tick timeline is always global no matter what!

Proposal

My proposal would be to get rid of thread-local clocks altogether, and instead put a clock in each and every RecordingStream.

Furthermore, clocks would not be refcounted:

#[derive(Clone)]
pub struct RecordingStream {
    clock: Clock,
    inner: Arc<Option<RecordingStreamInner>>,
}

That way, cloning a RecordingStream would still be relatively cheap (and more importantly, wouldn't clone the underlying batching and I/O pipelines) but would now behave more like a fork: the clone recording-stream just starts where its parent state left off.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions