Skip to content

Conversation

@gruuya
Copy link
Contributor

@gruuya gruuya commented Jun 21, 2024

Try to flush on a regular interval, so as not to rely on sync calls to do the pushing.

Also:

  • make enqueue_sync non-async by decoupling table creation and flushing from it
  • make table creation lazy (done during actual flushing)
  • explicit flush method is to be used by the handler

@gruuya gruuya requested a review from mildbyte June 21, 2024 06:31
Duration::from_secs(context.config.misc.sync_conf.write_lock_timeout_s);
let sync_writer = Arc::new(RwLock::new(SeafowlDataSyncWriter::new(context.clone())));
let handler = SeafowlFlightHandler::new(context, sync_writer.clone());
tokio::spawn(flush_task(flush_interval, lock_timeout, sync_writer));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about properly terminating this when Seafowl ends / restarting if it crashes / should this use the subsystem setup from the top level that Seafowl uses to run the GC task and other frontends?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps? I'd keep it as is for now and potentially add it as a subsystem later on (needs additional work for integration tests setup).

batches,
)?;

sync_writer.flush().await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still the case that this flush() op won't return until we've had a chance to see if we need to flush and wait for the flush to finish, right? (i.e. Seafowl will never end up using more memory than the limit)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flush call there checks the criteria and while it is met it flushes table after table. The memory usage can only go down inside/after the call.

@gruuya gruuya merged commit 0cc2690 into main Jun 21, 2024
@gruuya gruuya deleted the flush-task branch June 21, 2024 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants