SDK batching/revamp 1: impl DataTableBatcher #1980
Conversation
jleibs left a comment:

Some small comments and maybe a deadlock, but the overall structure looks good. Nice standalone clean PR!
    pub flush_tick: Duration,

    /// Flush if the accumulated payload has a size in bytes equal or greater than this.
    pub flush_num_bytes: u64,
There's a subtle distinction between "batch size" and "flush threshold" -- I suspect they are related but it's not entirely explicit here. A bit more explanation could be helpful.
I'm not sure what you're referring to? What's "batch size"? There's no config with that name 🤔
No, but this thing is a "Batcher", so it produces batches, and those batches have a size. In particular, I was curious whether a batch could have a size > flush_num_bytes (I believe the answer is yes), and I wanted it to be explicit that when we flush, all of the outstanding bytes (including those above the flush threshold) end up in the batch.
Ah yes, that's what the "equal or greater than this" alludes to. I'll make it extra clear.
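A minimal, hypothetical sketch of that semantics (the `Accumulator` type and `FLUSH_NUM_BYTES` constant are illustrative stand-ins, not the actual API): once the pending byte count reaches the threshold, everything outstanding goes into the batch, so a batch can end up larger than `flush_num_bytes`.

```rust
/// Illustrative stand-in for the batcher's internal accumulator.
struct Accumulator {
    pending: Vec<u64>, // per-row sizes, in bytes
    pending_num_bytes: u64,
}

impl Accumulator {
    /// Pushes a row; returns all outstanding rows if the flush threshold was crossed.
    fn push_row(&mut self, num_bytes: u64) -> Option<Vec<u64>> {
        self.pending.push(num_bytes);
        self.pending_num_bytes += num_bytes;
        const FLUSH_NUM_BYTES: u64 = 100; // hypothetical threshold
        if self.pending_num_bytes >= FLUSH_NUM_BYTES {
            self.pending_num_bytes = 0;
            // Flush *everything* outstanding, including the bytes above the threshold.
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }
}

fn main() {
    let mut acc = Accumulator { pending: Vec::new(), pending_num_bytes: 0 };
    assert!(acc.push_row(60).is_none()); // 60 < 100: keep accumulating
    let batch = acc.push_row(90).expect("150 >= 100: flush");
    assert_eq!(batch.iter().sum::<u64>(), 150); // the batch exceeds flush_num_bytes
}
```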
    let cmds_to_tables_handle = {
        const NAME: &str = "DataTableBatcher::cmds_to_tables";
        std::thread::Builder::new()
Out of scope for this PR, but I'm a bit unsure myself how we should decide when to use tokio vs. when to just spawn threads.
My general motto is: if you can avoid async, avoid async.
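For what it's worth, the named-thread approach needs nothing beyond the standard library; a minimal sketch of the pattern from the snippet above (the closure body is a placeholder for the actual batching loop):

```rust
use std::thread;

fn main() {
    // Spawn a named worker thread, as in the DataTableBatcher snippet above.
    const NAME: &str = "DataTableBatcher::cmds_to_tables";
    let handle = thread::Builder::new()
        .name(NAME.into())
        // Placeholder body: just report the thread's own name back.
        .spawn(|| thread::current().name().map(str::to_owned))
        .expect("failed to spawn thread");

    assert_eq!(handle.join().unwrap().as_deref(), Some(NAME));
}
```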
    fn drop(&mut self) {
        // NOTE: The command channel is private, if we're here, nothing is currently capable of
        // sending data down the pipeline.
        self.tx_cmds.send(Command::Shutdown).ok();
I think we want to drop self.rx_tables before we call this send, to avoid a deadlock?
Basically, if we are using a bounded channel and rx_tables isn't being drained by any listeners, then tx_table.send(table) could block the batching thread, preventing it from ever processing the Shutdown. Dropping rx_tables first should at least cause tx_table to error; or, if some other thread is still holding a receiver, then at least it isn't our problem.
Yeah, the docs mention this:

    /// Shutting down cannot ever block, unless the output channel is bounded and happens to be full
    /// (see [DataTableBatcherConfig::max_tables_in_flight]).
My thought process was that if the user deliberately configures the channel sizes to be bounded, then they should expect that the system can and will block at any (inconvenient) time if they don't consume as needed.
Eh, I guess we can be extra polite...
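The potential deadlock and the "drop the receiver first" fix can be sketched with a bounded `std::sync::mpsc::sync_channel`, used here as a std-only stand-in for whatever channel type the batcher actually uses:

```rust
use std::sync::mpsc::sync_channel;

fn main() {
    // Bounded channel with capacity 1: once it's full, another send() would
    // block for as long as a receiver is alive -- the deadlock scenario
    // described above (the batching thread stuck in send, never seeing Shutdown).
    let (tx_table, rx_tables) = sync_channel::<&str>(1);
    tx_table.send("table").unwrap(); // fills the channel

    // Dropping the receiver first makes the next send fail fast instead of
    // blocking, so shutdown can proceed.
    drop(rx_tables);
    assert!(tx_table.send("another table").is_err());
}
```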
    // --- Subscribe to tables ---

    /// Returns a _shared_ channel in which are sent the batched [`DataTable`]s.
Any particular reason to make this shared? Is there value in a new receiver jumping in mid-stream? I worry about the usefulness of the data in that case.
The channel is already mpmc by nature, so I didn't see the point of jumping through extra hoops to turn it back into an mpsc one. Also, parallel consumers might come in handy at some point.
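A std-only illustration of the "parallel consumers" point: std's mpsc receiver isn't cloneable, so this wraps it in `Arc<Mutex<...>>` purely as a stand-in for a natively mpmc channel.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<u32>();
    let rx = Arc::new(Mutex::new(rx)); // shared receiver, stand-in for mpmc

    for i in 0..4 {
        tx.send(i).unwrap();
    }
    drop(tx); // close the channel so consumers eventually stop

    // Two parallel consumers drain the channel; each item is seen exactly once.
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut sum = 0;
                loop {
                    let msg = rx.lock().unwrap().recv();
                    match msg {
                        Ok(v) => sum += v,
                        Err(_) => break, // channel closed and drained
                    }
                }
                sum
            })
        })
        .collect();

    let total: u32 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 6); // 0 + 1 + 2 + 3, consumed exactly once overall
}
```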
    ) -> bool {
        // TODO(#1760): now that we're re-doing this here, it really is a massive waste not to send
        // it over the wire...
        row.compute_all_size_bytes();
How expensive is this? Wasn't this a bottleneck before?
It was and still is, and that's what's so nice about it: it's now done in a background thread on the clients...
But we still need to send that information to the server to make things really fantastic.
> it's now done in a background thread on the clients
We should be careful about that though... we don't want to make this a bottleneck or a CPU drain on the clients either.
It really isn't: it's extremely costly compared to the rest of the operations we do on the very fast paths in the store, but it's still orders of magnitude faster than most of what goes on on the client... especially if it's a Python client 😒
    acc.pending_num_rows >= config.flush_num_rows
        || acc.pending_num_bytes >= config.flush_num_bytes
Rather than returning a bool it seems like it would be clearer to call do_flush_all here.
I find it clearer to be able to see all flush triggers in the main loop
In that case why not move the check into the main loop as well?
    do_push_row(&mut acc, row);

    if acc.pending_num_rows >= config.flush_num_rows
        || acc.pending_num_bytes >= config.flush_num_bytes
    {
        do_flush_all(&mut acc, &tx_table, "bytes|rows");
        acc.reset();
    }
But, at the very least, add a comment to do_push_row indicating that it returns a bool, with the assumption that the caller will flush the data if it returns true?
Moving the check sounds good to me 👍
This PR implements DataTableBatcher, which... batches DataTables. Not used anywhere yet, just the type itself.

- DataTableBatcher #1980
- Session with RecordingContext #1983
- clock example for Rust #2000
- Python Session #1985

Part of #1619

Related:
- DataCell's size (& other metadata) over the wire #1760

Future work:
- DataTable::sort shared with DataStore #1981