Skip to content

Steady state replication throttling POC#3

Open
harrylin98 wants to merge 6 commits into
unstablefrom
rate_limit_poc
Open

Steady state replication throttling POC#3
harrylin98 wants to merge 6 commits into
unstablefrom
rate_limit_poc

Conversation

@harrylin98

Copy link
Copy Markdown
Owner

No description provided.

Signed-off-by: harrylin98 <harrylin980107@gmail.com>
Signed-off-by: harrylin98 <harrylin980107@gmail.com>
harrylin98 and others added 2 commits June 23, 2026 16:25
…ths (valkey-io#3600)

The `pending_command` flag indicates that a client has a fully parsed command ready for execution. This update ensures that the flag is set/cleared consistently across different execution paths.

---------

Signed-off-by: harrylin98 <harrylin980107@gmail.com>
Signed-off-by: harrylin98 <harrylin980107@gmail.com>
@harrylin98 harrylin98 force-pushed the rate_limit_poc branch 4 times, most recently from 430a560 to 72ebc75 Compare June 24, 2026 04:54
Signed-off-by: harrylin98 <harrylin980107@gmail.com>
Signed-off-by: harrylin98 <harrylin980107@gmail.com>

@JimB123 JimB123 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I focused on the APIs. Big comments:

  • APIs (.h files) need solid documentation
  • Avoid pulling in Amazon overdesign
  • Consider each thing in the .h file. Ask does this need to be part of the API? Why?

Comment thread src/config.c
createBoolConfig("repl-mptcp", NULL, IMMUTABLE_CONFIG, server.repl_mptcp, 0, isValidMptcp, NULL),
createBoolConfig("repl-diskless-sync", NULL, DEBUG_CONFIG | MODIFIABLE_CONFIG, server.repl_diskless_sync, 1, NULL, NULL),
createBoolConfig("dual-channel-replication-enabled", NULL, DEBUG_CONFIG | MODIFIABLE_CONFIG, server.dual_channel_replication, 0, NULL, NULL),
createBoolConfig("repl-throttle", NULL, MODIFIABLE_CONFIG, server.repl_throttle, 0, NULL, NULL),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a few people to think about naming. Specifically, this throttler is for steady-state replication only.

Maybe this is ok. At Amazon, we have fine grained control, with different throttlers configured for different phases of full sync and steady-state.

Comment thread src/server.h
Comment on lines +2393 to +2395
/* Throttling */
long long total_throttled_commands; /* Total commands deferred by the throttle framework */
int repl_throttle; /* Enable replication throttle */

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The throttling unit can own these. No need to add to server.

Also, repl_throttle - check naming. This is a config value. Probably want a name like repl_throttle_enabled.

I'd recommend:

  • total_throttled_commands is a metric related to the framework. I'd add this to throttle.h
struct throttle_metrics {
    long long total_throttled_commands;
};
extern struct throttle_metrics throttle_metrics; // the actual struct is defined in the .c file
  • repl_throttle is a config and is specific to replication throttling. I'd add this to throttle_repl.h
struct throttle_repl_config {
    int throttle_repl_enabled;
};
extern struct throttle_repl_config throttle_repl_config;

Comment thread src/throttle.h
@@ -0,0 +1,46 @@
#ifndef THROTTLE_H

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add copyright above

Comment thread src/throttle.h
long oldest_client_delay_us;
} throttleMetrics;

/* Public API */

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The public API should be fully documented.

Comment thread src/throttle.h

void throttle_deregister(int id);

void *throttle_setPrivData(int id, void *new_priv_data);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added later at Amazon. Do we really need this?


void tokenBucket_capDebt(tokenBucket *bucket, double max_debt);
double tokenBucket_add(tokenBucket *bucket, double tokens);
double tokenBucket_replenish(tokenBucket *bucket);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this to be an API. This can happen intrinsically.

Comment on lines +23 to +24
bool tokenBucket_canConsume(tokenBucket *bucket, double tokens);
void tokenBucket_consume(tokenBucket *bucket, double tokens);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need separate canConsume and consume functions?

Eliminate canConsume and change consume to return a bool.

bool tokenBucket_canConsume(tokenBucket *bucket, double tokens);
void tokenBucket_consume(tokenBucket *bucket, double tokens);
double tokenBucket_msUntilAvailable(tokenBucket *bucket, double tokens);
void tokenBucket_halt(tokenBucket *bucket);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a "halt" function. Looks like overdesign.

Comment on lines +15 to +16
double tokenBucket_getBucketSize(tokenBucket *bucket);
double tokenBucket_getMaxBurstTime(tokenBucket *bucket);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear if these are actually needed.

typedef struct tokenBucket tokenBucket;

// APIs
tokenBucket *tokenBucket_create(double tokens_per_sec, double max_burst_time_secs, bucketSizeFunc *bucket_size_func);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is bucket_size_function? How is it used? Why is it needed? Is this overdesign?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants