[BUG][PERFORMANCE]: Fix high-impact performance issues in llm-guard plugin #1960

@araujof


🐞 Bug Summary

The LLM-Guard plugin has several high-impact performance issues that add noticeable latency to content scanning. They need to be fixed before the plugin is suitable for production use.


🧩 Affected Component

  • mcpgateway - API
  • mcpgateway - UI (admin panel)
  • mcpgateway.wrapper - stdio wrapper
  • Federation or Transports
  • CLI, Makefiles, or shell scripts
  • Container setup (Docker/Podman/Compose)
  • Other: plugins/llm-guard

🔍 Performance Issues Identified

| Issue | Impact | Priority |
|-------|--------|----------|
| Levenshtein calculation | High CPU per scan | P1 |
| Missing scan result caching | Redundant processing | P1 |
| Policy expressions not pre-compiled | Repeated compilation | P2 |
| Context update overhead | Per-request cost | P2 |
| Vault expiry checking | Synchronous blocking | P3 |

📋 Tasks

  • Optimize Levenshtein calculation

    • Consider using python-Levenshtein C extension
    • Implement early termination for obvious non-matches
    • Cache distance calculations for repeated comparisons
  • Implement scan result caching

    • Add content hash-based caching
    • Configure TTL for cache entries
    • Add cache hit/miss metrics
  • Pre-compile policy expressions

    • Compile regex patterns at plugin initialization
    • Cache compiled policies per scanner instance
    • Validate policies at startup, not per-request
  • Reduce context update overhead

    • Batch context updates where possible
    • Use incremental updates instead of full rebuilds
    • Profile and optimize hot paths
  • Optimize vault expiry checking

    • Move expiry checks to background task
    • Implement lazy expiry validation
    • Add async expiry checking
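
To make the two P1 tasks above more concrete, here is a minimal sketch in plain Python. It is not the plugin's code: `bounded_levenshtein`, `ScanCache`, and their parameters are hypothetical names chosen for illustration, and the cache wraps an arbitrary `scan` callable rather than the real scanner interface. The first function shows early termination for obvious non-matches; the class shows content-hash keys, a TTL, and hit/miss counters.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple


def bounded_levenshtein(a: str, b: str, max_distance: int) -> Optional[int]:
    """Return the edit distance of a and b, or None if it exceeds max_distance."""
    # Cheap rejection: the distance is at least the difference in length.
    if abs(len(a) - len(b)) > max_distance:
        return None
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        # Row minima never decrease, so once a whole row is over the
        # bound the final distance must be over it as well.
        if min(current) > max_distance:
            return None
        previous = current
    distance = previous[-1]
    return distance if distance <= max_distance else None


class ScanCache:
    """Hypothetical wrapper that caches scan results by content hash."""

    def __init__(self, scan: Callable[[str], Any], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._scan = scan
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps content hash -> (expiry timestamp, cached result).
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def scan(self, content: str) -> Any:
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired entry: run the real scan and store the result.
        self.misses += 1
        result = self._scan(content)
        self._entries[key] = (now + self._ttl, result)
        return result
```

The cache in this sketch is unbounded, so a real version would also need a size limit (for example LRU eviction) to respect the memory criterion below. Switching to a C extension for the distance calculation is a separate option and is not shown here.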

🎯 Success Criteria

  • Average scan latency reduced by 50%
  • P99 latency under 100ms for typical content
  • Cache hit rate > 80% for repeated content
  • No increase in memory usage
  • All existing tests pass
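
One possible way to check the two latency criteria from benchmark samples is sketched below. The function names and the `p99_budget_ms` parameter are assumptions made for illustration, and "reduced by 50%" is read here as "the new mean is at most half the baseline mean".

```python
import statistics
from typing import Dict, Sequence


def summarize_latencies(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Return the mean and 99th percentile of latency samples (milliseconds)."""
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    cut_points = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"mean_ms": statistics.fmean(samples_ms), "p99_ms": cut_points[98]}


def meets_latency_criteria(baseline_ms: Sequence[float],
                           optimized_ms: Sequence[float],
                           p99_budget_ms: float = 100.0) -> bool:
    """Check the mean-reduction and P99 criteria against two sample sets."""
    before = summarize_latencies(baseline_ms)
    after = summarize_latencies(optimized_ms)
    mean_halved = after["mean_ms"] <= 0.5 * before["mean_ms"]
    p99_within_budget = after["p99_ms"] < p99_budget_ms
    return mean_halved and p99_within_budget
```

Both sample sets should come from the same workload, since "typical content" is not defined more precisely in this issue.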

📊 Metrics to Track

# Add these metrics to the plugin
llm_guard_scan_duration_seconds
llm_guard_cache_hits_total
llm_guard_cache_misses_total
llm_guard_levenshtein_duration_seconds
llm_guard_policy_compile_duration_seconds
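
As a rough illustration of how the duration metrics could be recorded, the sketch below uses only the standard library. The `durations` store and the `timed` helper are hypothetical; in the plugin these values would presumably be reported through whatever metrics library the gateway already uses.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Callable, DefaultDict, Iterator, List

# Hypothetical in-process store: metric name -> observed durations in seconds.
durations: DefaultDict[str, List[float]] = defaultdict(list)


@contextmanager
def timed(metric_name: str,
          clock: Callable[[], float] = time.perf_counter) -> Iterator[None]:
    """Record how long the enclosed block took under metric_name."""
    start = clock()
    try:
        yield
    finally:
        # Record the duration even if the timed block raises.
        durations[metric_name].append(clock() - start)
```

A scan could then be wrapped as `with timed("llm_guard_scan_duration_seconds"): ...`, and the Levenshtein and policy-compile paths would be wrapped the same way with their own metric names.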

🔗 Related Issues

Labels

  • MUST (P1: Non-negotiable, critical requirements without which the product is non-functional or unsafe)
  • bug (Something isn't working)
  • performance (Performance related items)
  • plugins
