Cross-session rate limit guard for Hermes Agent side clients (Nous Portal, gateway, cron, auxiliary) that prevents retry amplification when rate limits are hit.
Each 429 (rate limit) error from a provider triggers up to 9 API calls per conversation turn:
- 3 SDK retries × 3 Hermes retries = 9 calls
- Every call counts against RPH (requests per hour)
- This amplification can quickly exhaust rate limits
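The arithmetic above can be checked with a small counting sketch. It only models two nested retry loops that fail every time; the function name and parameters are illustrative and are not taken from the plugin or from Hermes.

```python
def count_calls(sdk_retries: int = 3, hermes_retries: int = 3) -> int:
    """Count provider calls in one turn when every attempt returns 429."""
    calls = 0
    for _ in range(hermes_retries):   # outer loop: Hermes-level retries
        for _ in range(sdk_retries):  # inner loop: SDK-level retries
            calls += 1                # every attempt counts against RPH
    return calls


print(count_calls())  # 9
```

With a guard that stops after the first 429, the same turn would cost a single call instead of nine.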
This plugin records rate limit state on the first 429 and checks it before subsequent attempts across all sessions (CLI, gateway, cron, auxiliary).
- Cross-Session Tracking: Shared state file accessible by all Hermes sessions
- Retry Amplification Prevention: Records rate limit state on first 429 error
- Plugin Hooks: `pre_llm_call` (check rate limit), `post_llm_call` (record 429 errors)
- Runtime Controls: `/ratelimit` slash commands for status, clear, enable, disable, set
- Config Persistence: Enable/disable status and default cooldown stored in config file
- Thread-Safe: Atomic writes for safe concurrent access
# Clone the repository
git clone https://github.com/LVT382009/hermes-rate-limiter-plugin.git
cd hermes-rate-limiter-plugin
# Run the installer
./install.sh

# Create plugin directory
mkdir -p ~/.hermes/plugins/rate-limiter
# Copy plugin files
cp rate-limiter/* ~/.hermes/plugins/rate-limiter/

# Clone the repository
git clone https://github.com/LVT382009/hermes-rate-limiter-plugin.git
cd hermes-rate-limiter-plugin
# Copy to Hermes plugins directory
cp -r rate-limiter ~/.hermes/plugins/

hermes plugins enable rate-limiter

# Check rate limit status
/ratelimit status
# Clear rate limit state
/ratelimit clear
# Enable rate limiter
/ratelimit enable
# Disable rate limiter
/ratelimit disable
# Set default cooldown (seconds)
/ratelimit set 300

$ /ratelimit status
Rate limiter: ✓ ENABLED
Default cooldown: 5m
✓ No active rate limit. Requests will proceed normally.
$ /ratelimit set 600
✓ Default cooldown set to 10m.
$ /ratelimit status
Rate limiter: ✓ ENABLED
Default cooldown: 10m
✓ No active rate limit. Requests will proceed normally.
| Hook | Behaviour |
|---|---|
| `pre_llm_call` | Check if provider is currently rate-limited. If so, skip the request and return early. |
| `post_llm_call` | If a 429 error is received, parse reset time from headers/error context and record to shared state file. |
Priority order (first match wins):
1. `x-ratelimit-reset-requests-1h` - hourly RPH window (most useful)
2. `x-ratelimit-reset-requests` - per-minute RPM window
3. `retry-after` - generic HTTP header
4. Default cooldown: 5 minutes (300 seconds)
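The priority order above could be implemented roughly as follows. This is a sketch under two assumptions that the README does not confirm: header values are plain numbers of seconds, and an invalid or non-positive value is skipped so that the next header in the list is tried. The function name `parse_reset_seconds` is hypothetical.

```python
import math
from typing import Mapping

# Header names in the documented priority order (first match wins).
HEADER_PRIORITY = (
    "x-ratelimit-reset-requests-1h",
    "x-ratelimit-reset-requests",
    "retry-after",
)
DEFAULT_COOLDOWN = 300.0  # 5 minutes


def parse_reset_seconds(headers: Mapping[str, str],
                        default: float = DEFAULT_COOLDOWN) -> float:
    """Return the cooldown in seconds from the highest-priority usable header."""
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in HEADER_PRIORITY:
        raw = lowered.get(name)
        if raw is None:
            continue
        try:
            seconds = float(raw)
        except (TypeError, ValueError):
            continue  # assumption: unparsable values fall through
        if math.isfinite(seconds) and seconds > 0:
            return seconds
    return default


print(parse_reset_seconds({"Retry-After": "30",
                           "x-ratelimit-reset-requests-1h": "1200"}))  # 1200.0
```

Note that `Retry-After` may also carry an HTTP date rather than a number of seconds; this sketch does not parse that form and would fall back to the default cooldown in that case.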
The plugin stores configuration at $HERMES_HOME/rate_limits/config.json:
{
"enabled": true,
"default_cooldown": 300.0
}

- `enabled`: Boolean - whether the rate limiter is active (default: true)
- `default_cooldown`: Float - default cooldown in seconds (default: 300)
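As an illustration of how such a file might be read with the documented defaults applied, here is a minimal sketch. The helper names, the fallback to `~/.hermes` when `HERMES_HOME` is unset, and the decision to ignore malformed values are all assumptions, not the plugin's confirmed behaviour.

```python
import json
import os
from pathlib import Path
from typing import Optional

DEFAULTS = {"enabled": True, "default_cooldown": 300.0}


def config_path() -> Path:
    # Assumption: fall back to ~/.hermes when HERMES_HOME is not set.
    home = os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes")
    return Path(home) / "rate_limits" / "config.json"


def load_config(path: Optional[Path] = None) -> dict:
    """Load config.json, keeping the documented defaults for missing or bad values."""
    path = path or config_path()
    config = dict(DEFAULTS)
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return config  # missing or unreadable file: use defaults
    if isinstance(data, dict):
        if isinstance(data.get("enabled"), bool):
            config["enabled"] = data["enabled"]
        cooldown = data.get("default_cooldown")
        # bool is a subclass of int, so exclude it explicitly.
        if (isinstance(cooldown, (int, float))
                and not isinstance(cooldown, bool) and cooldown > 0):
            config["default_cooldown"] = float(cooldown)
    return config
```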
State is stored at $HERMES_HOME/rate_limits/nous.json:
{
"reset_at": 1713792000.0,
"recorded_at": 1713791700.0,
"reset_seconds": 300.0
}

Run the test suite:
# From the plugin directory
python3 -m pytest tests/test_rate_limiter_plugin.py -v
# Expected: All 16 tests pass

Tests cover:
- Plugin registration and hook integration
- Slash commands (status, clear, enable, disable, set)
- Config persistence and reload
- Cross-session state persistence
- Thread-safe concurrent access
- Header parsing (priority order, invalid values)
- Rate limit recording (headers, error context, default cooldown)
- Rate limit checking (remaining time, expiration)
- Atomic writes: temp file + rename for safe concurrent access
- Expired entries are automatically cleaned up on read
- State file is scoped to `$HERMES_HOME/rate_limits/`
- No external dependencies beyond standard library
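The temp-file-plus-rename write and the cleanup of expired entries on read could look roughly like the sketch below. It uses only the standard library, and the state fields match the `nous.json` example in this README; the function names and error handling are assumptions rather than the plugin's actual implementation.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def write_state(path: Path, reset_seconds: float, now: Optional[float] = None) -> None:
    """Write the state file atomically: temp file in the same directory, then rename."""
    now = time.time() if now is None else now
    state = {
        "reset_at": now + reset_seconds,
        "recorded_at": now,
        "reset_seconds": reset_seconds,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        os.replace(tmp_name, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise


def remaining_seconds(path: Path, now: Optional[float] = None) -> Optional[float]:
    """Return seconds until reset, or None; an expired state file is deleted."""
    now = time.time() if now is None else now
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
        reset_at = float(state["reset_at"])
    except (OSError, ValueError, KeyError, TypeError):
        return None  # missing or malformed state is treated as "not limited"
    if reset_at <= now:
        try:
            path.unlink()  # clean up the expired entry on read
        except OSError:
            pass
        return None
    return reset_at - now
```

Creating the temp file in the same directory as the target matters because a rename is only atomic within a single filesystem.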
- Hermes Agent (https://github.com/NousResearch/hermes-agent)
- Python 3.8+
- No external dependencies
hermes-rate-limiter-plugin/
├── install.sh # Installation script
├── README.md # This file
├── rate-limiter/ # Plugin files
│ ├── __init__.py # Plugin registration
│ ├── plugin.yaml # Plugin manifest
│ ├── rate_limiter.py # Core rate limiting logic
│ ├── commands.py # Slash command handlers
│ └── README.md # Plugin documentation
└── tests/
└── test_rate_limiter_plugin.py # Test suite
# Check if plugin files are in the correct location
ls -la ~/.hermes/plugins/rate-limiter/
# Check plugin status
hermes plugins list

# Check if plugin is enabled
hermes plugins list
# Enable the plugin
hermes plugins enable rate-limiter
# Check rate limiter status
/ratelimit status

# Make sure install.sh is executable
chmod +x install.sh
# Run installer
./install.sh

Contributions are welcome! Please feel free to submit a Pull Request.
This plugin is part of the Hermes Agent project and follows the same license.
LVT382009 (Le Van Tam)