
Suggestion: Vttablet: on demand lag throttler heartbeats to reduce binary log volume #10196

@shlomi-noach

The main reason the tablet throttler is still disabled by default is that it requires heartbeats to operate (--enable_heartbeat). To get good throttling resolution, heartbeats are injected at sub-second intervals.

The problem is that heartbeats bloat the binary logs. We are aware of multiple customers who are not enabling heartbeats for this reason.

An idea that originated in an internal discussion with @aquarapid is to have "lazy", or "on demand", heartbeats. @rohit-nayak-ps and I discussed the idea and came up with what we believe is a usable solution.

We suggest on-demand heartbeats: the tablet does not write any heartbeats unless requested to. An internal module such as the throttler could reach out to the tablet and announce "I need heartbeats". The tablet will then produce heartbeats at the configured --heartbeat_interval, for a limited duration. That duration is to be controlled by a new command line flag.

So heartbeats will be on a lease. Should new requests come in while the lease is ongoing, then the lease is extended. When no more requests for heartbeats come, the lease eventually expires and the tablet stops writing heartbeats.

This lets us generate heartbeats only when the throttler needs them. Specifically: whenever there is an active vreplication workflow or an Online DDL operation in progress.

Take Online DDL as an example. The operation generates a massive amount of writes (duplicating a table); compared with that, heartbeats are negligible, so enabling them during the migration is relatively cheap and provides great value. When the migration terminates, heartbeats are no longer strictly required and become expensive relative to the remaining write load, so we turn them off.

This is a "cold engine" scenario. The first time the throttler is asked to operate, heartbeats will be stale and the throttler will push back. But it then sends a request for heartbeats, and within a short time we can expect the mechanism to "warm up" and begin serving.

PR incoming.
