LVT382009/hermes-rate-limiter-plugin


Hermes Rate Limiter Plugin

Cross-session rate limit guard for Hermes Agent side clients (Nous Portal, gateway, cron, auxiliary) that prevents retry amplification when rate limits are hit.

Problem

A single 429 (rate limit) error from a provider can trigger up to 9 API calls per conversation turn, because two retry layers multiply:

  • 3 SDK retries × 3 Hermes retries = 9 calls
  • Every call counts against RPH (requests per hour)
  • This amplification can quickly exhaust rate limits
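The arithmetic above can be sketched as two nested retry loops. This is only an illustration of the multiplication effect, not the actual SDK or Hermes retry code; the function and constant names are hypothetical.

```python
# Hypothetical illustration of stacked retry layers; names are not from Hermes or any SDK.
SDK_RETRIES = 3
HERMES_RETRIES = 3


def count_calls_when_rate_limited(sdk_retries: int = SDK_RETRIES,
                                  hermes_retries: int = HERMES_RETRIES) -> int:
    """Count API calls made in one turn when every attempt returns 429."""
    calls = 0
    for _ in range(hermes_retries):   # outer layer: agent-level retries
        for _ in range(sdk_retries):  # inner layer: SDK-level retries
            calls += 1                # each attempt still counts against RPH
    return calls


print(count_calls_when_rate_limited())  # 9
```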

Solution

The plugin records rate limit state on the first 429 and checks that state before every subsequent attempt in every session (CLI, gateway, cron, auxiliary), so a rate-limited provider is not called again until its window resets.

Features

  • Cross-Session Tracking: Shared state file accessible by all Hermes sessions
  • Retry Amplification Prevention: Records rate limit state on first 429 error
  • Plugin Hooks: pre_llm_call (check rate limit), post_llm_call (record 429 errors)
  • Runtime Controls: /ratelimit slash commands for status, clear, enable, disable, set
  • Config Persistence: Enable/disable status and default cooldown stored in config file
  • Thread-Safe: Atomic writes for safe concurrent access

Installation

Option 1: Using install.sh (Recommended)

# Clone the repository
git clone https://github.com/LVT382009/hermes-rate-limiter-plugin.git
cd hermes-rate-limiter-plugin

# Run the installer
./install.sh

Option 2: Manual Installation

# Run these from the root of a cloned copy of this repository

# Create plugin directory
mkdir -p ~/.hermes/plugins/rate-limiter

# Copy plugin files
cp rate-limiter/* ~/.hermes/plugins/rate-limiter/

Option 3: From Source

# Clone the repository
git clone https://github.com/LVT382009/hermes-rate-limiter-plugin.git
cd hermes-rate-limiter-plugin

# Copy to Hermes plugins directory
cp -r rate-limiter ~/.hermes/plugins/

Usage

Enable the Plugin

hermes plugins enable rate-limiter

Slash Commands

# Check rate limit status
/ratelimit status

# Clear rate limit state
/ratelimit clear

# Enable rate limiter
/ratelimit enable

# Disable rate limiter
/ratelimit disable

# Set default cooldown (seconds)
/ratelimit set 300

Example Output

$ /ratelimit status
Rate limiter: ✓ ENABLED
Default cooldown: 5m
✓ No active rate limit. Requests will proceed normally.

$ /ratelimit set 600
✓ Default cooldown set to 10m.

$ /ratelimit status
Rate limiter: ✓ ENABLED
Default cooldown: 10m
✓ No active rate limit. Requests will proceed normally.
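The status output shows a 300-second cooldown as `5m` and a 600-second cooldown as `10m`. A formatter along these lines would reproduce that. This is a guess at the display logic, not the plugin's code; the function name and the handling of values that are not whole minutes are assumptions.

```python
def format_cooldown(seconds: float) -> str:
    """Render a cooldown in seconds as a short human-readable string.

    Hypothetical helper: only the whole-minute outputs are shown in the README;
    the "1m30s" and "45s" forms are assumptions.
    """
    total = int(round(seconds))
    minutes, secs = divmod(total, 60)
    if minutes and secs:
        return f"{minutes}m{secs}s"
    if minutes:
        return f"{minutes}m"
    return f"{secs}s"


print(format_cooldown(300))  # 5m
print(format_cooldown(600))  # 10m
```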

How It Works

| Hook | Behaviour |
| --- | --- |
| pre_llm_call | Check if provider is currently rate-limited. If so, skip the request and return early. |
| post_llm_call | If a 429 error is received, parse reset time from headers/error context and record to shared state file. |

Reset Time Parsing

Priority order (first match wins):

  1. x-ratelimit-reset-requests-1h - hourly RPH window (most useful)
  2. x-ratelimit-reset-requests - per-minute RPM window
  3. retry-after - generic HTTP header
  4. Default cooldown: 5 minutes (300 seconds)
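The priority order above could be implemented roughly as follows. This sketch assumes each header carries a plain number of seconds; `retry-after` values in HTTP-date form and provider-specific duration strings are simply skipped here. The function name and the fall-through behaviour for invalid values are assumptions.

```python
import math
from typing import Mapping

# Checked in order; the first header with a usable value wins.
HEADER_PRIORITY = (
    "x-ratelimit-reset-requests-1h",
    "x-ratelimit-reset-requests",
    "retry-after",
)


def parse_reset_seconds(headers: Mapping[str, str],
                        default_cooldown: float = 300.0) -> float:
    """Return the cooldown in seconds implied by the response headers."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in HEADER_PRIORITY:
        raw = lowered.get(name)
        if raw is None:
            continue
        try:
            seconds = float(raw)
        except (TypeError, ValueError):
            continue  # assumption: unparseable values fall through to the next header
        if math.isfinite(seconds) and seconds > 0:
            return seconds
    return default_cooldown
```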

Configuration

The plugin stores configuration at $HERMES_HOME/rate_limits/config.json:

{
  "enabled": true,
  "default_cooldown": 300.0
}

  • enabled: Boolean - whether the rate limiter is active (default: true)
  • default_cooldown: Float - default cooldown in seconds (default: 300)
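As a rough sketch, a loader for this file could fall back to the documented defaults whenever the file is missing, unreadable, or contains unexpected values. The function name `load_config` and the validation rules are assumptions, not the plugin's actual code.

```python
import json
from pathlib import Path

# Defaults as documented above.
DEFAULTS = {"enabled": True, "default_cooldown": 300.0}


def load_config(hermes_home: str) -> dict:
    """Load rate_limits/config.json under hermes_home, falling back to defaults."""
    path = Path(hermes_home) / "rate_limits" / "config.json"
    config = dict(DEFAULTS)
    try:
        loaded = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # missing file, unreadable file, or invalid JSON
        return config
    if not isinstance(loaded, dict):
        return config
    if isinstance(loaded.get("enabled"), bool):
        config["enabled"] = loaded["enabled"]
    cooldown = loaded.get("default_cooldown")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(cooldown, (int, float)) and not isinstance(cooldown, bool) and cooldown > 0:
        config["default_cooldown"] = float(cooldown)
    return config
```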

State is stored at $HERMES_HOME/rate_limits/nous.json:

{
  "reset_at": 1713792000.0,
  "recorded_at": 1713791700.0,
  "reset_seconds": 300.0
}
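In this example, `reset_at` equals `recorded_at` plus `reset_seconds` (1713791700 + 300 = 1713792000). A check against such a record might look like the sketch below; the helper name `remaining_seconds` is hypothetical.

```python
import time
from typing import Optional


def remaining_seconds(state: dict, now: Optional[float] = None) -> float:
    """Seconds left until the recorded rate limit resets (0.0 if already expired)."""
    now = time.time() if now is None else now
    return max(0.0, float(state.get("reset_at", 0.0)) - now)


state = {
    "reset_at": 1713792000.0,
    "recorded_at": 1713791700.0,
    "reset_seconds": 300.0,
}

# Two minutes after the 429 was recorded, three minutes remain.
print(remaining_seconds(state, now=1713791820.0))  # 180.0
```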

Testing

Run the test suite:

# From the plugin directory
python3 -m pytest tests/test_rate_limiter_plugin.py -v

# Expected: All 16 tests pass

Tests cover:

  • Plugin registration and hook integration
  • Slash commands (status, clear, enable, disable, set)
  • Config persistence and reload
  • Cross-session state persistence
  • Thread-safe concurrent access
  • Header parsing (priority order, invalid values)
  • Rate limit recording (headers, error context, default cooldown)
  • Rate limit checking (remaining time, expiration)

Safety

  • Atomic writes: temp file + rename for safe concurrent access
  • Expired entries are automatically cleaned up on read
  • State file is scoped to $HERMES_HOME/rate_limits/
  • No external dependencies beyond standard library
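The first two points can be sketched with the standard library alone. The write goes to a temporary file in the same directory and is then renamed over the target, so readers see either the old file or the new one, never a partial write. Function names are hypothetical and the plugin's actual implementation may differ.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def write_state_atomic(path: Path, state: dict) -> None:
    """Write state as JSON via temp file + rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)  # do not leave a stray temp file behind
        except FileNotFoundError:
            pass
        raise


def read_state(path: Path, now: Optional[float] = None) -> Optional[dict]:
    """Return the active state, or None; expired entries are deleted on read."""
    now = time.time() if now is None else now
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    reset_at = state.get("reset_at") if isinstance(state, dict) else None
    if not isinstance(reset_at, (int, float)) or reset_at <= now:
        try:
            path.unlink()
        except FileNotFoundError:
            pass  # another session already cleaned it up
        return None
    return state
```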

Requirements

File Structure

hermes-rate-limiter-plugin/
├── install.sh              # Installation script
├── README.md               # This file
├── rate-limiter/           # Plugin files
│   ├── __init__.py        # Plugin registration
│   ├── plugin.yaml        # Plugin manifest
│   ├── rate_limiter.py    # Core rate limiting logic
│   ├── commands.py        # Slash command handlers
│   └── README.md          # Plugin documentation
└── tests/
    └── test_rate_limiter_plugin.py  # Test suite

Troubleshooting

Plugin not showing up

# Check if plugin files are in the correct location
ls -la ~/.hermes/plugins/rate-limiter/

# Check plugin status
hermes plugins list

Rate limiter not working

# Check if plugin is enabled
hermes plugins list

# Enable the plugin
hermes plugins enable rate-limiter

# Check rate limiter status
/ratelimit status

Permission errors

# Make sure install.sh is executable
chmod +x install.sh

# Run installer
./install.sh

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This plugin is part of the Hermes Agent project and follows the same license.

Links

Author

LVT382009 (Le Van Tam)
