
Issue 1959 [PERFORMANCE]: Fix critical performance issues in llm-guard plugin#2200

Merged
crivetimihai merged 1 commit into IBM:main from tedhabeck:issue-1959
Jan 24, 2026

Conversation

@tedhabeck
Collaborator

🐛 Bug-fix PR

Before opening this PR please:

  1. make lint - passes ruff, mypy, pylint
  2. make test - all unit + integration tests green
  3. make coverage - ≥ 90 %
  4. make docker docker-run-ssl or make podman podman-run-ssl
  5. Update relevant documentation.
  6. Tested with sqlite and postgres + redis.
  7. Manual regression no longer fails. Ensure the UI and /version work correctly.

📌 Summary

#1959

🔁 Reproduction Steps

LOG_LEVEL=ERROR python tests/performance/test_plugins_performance.py --details 2> /dev/null

🐞 Root Cause

#1959

💡 Fix Description

Followed the approach prescribed in epic #1958.

🧪 Verification

LOG_LEVEL=ERROR python tests/performance/test_plugins_performance.py --details 2> /dev/null

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • No secrets/credentials committed

@crivetimihai crivetimihai changed the title Issue 1959 Issue 1959 [PERFORMANCE]: Fix critical performance issues in llm-guard plugin Jan 20, 2026
@tedhabeck
Collaborator Author

tedhabeck commented Jan 20, 2026

Analysis:

Plugin                            P:post       P:pre      R:post       R:pre      T:post       T:pre
----------------------------------------------------------------------------------------------------
Baseline:
LLMGuardPlugin                  22.748ms   116.705ms           —           —           —           —
After cache update:
LLMGuardPlugin                  23.611ms   130.061ms           —           —           —           —
After scanner threading update:
LLMGuardPlugin                   0.148ms     0.120ms           —           —           —           —
After pickle threading:
LLMGuardPlugin                   0.215ms     0.249ms           —           —           —           —
After eval() replaced with ast and orjson update:
LLMGuardPlugin                   0.153ms     0.139ms           —           —           —           —
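The last row refers to replacing raw `eval()` with an AST-based evaluator (the change merged from #2180). A minimal sketch of that technique, using only the stdlib `ast` module, might look like the following; the supported operator set and the `safe_eval` name are illustrative assumptions, not the plugin's actual grammar:

```python
import ast
import operator

# Hypothetical safe evaluator: walks the parsed AST and refuses any node
# type outside a small whitelist, unlike eval(), which executes anything.
_OPS = {
    ast.And: all, ast.Or: any,
    ast.Gt: operator.gt, ast.Lt: operator.lt,
    ast.GtE: operator.ge, ast.LtE: operator.le,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}

def safe_eval(expr: str, names: dict):
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BoolOp):  # and / or
            return _OPS[type(node.op)](visit(v) for v in node.values)
        if isinstance(node, ast.Compare):  # supports chained comparisons
            left, result = visit(node.left), True
            for op, comp in zip(node.ops, node.comparators):
                right = visit(comp)
                result = result and _OPS[type(op)](left, right)
                left = right
            return result
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError(f"disallowed expression node: {type(node).__name__}")
    return visit(ast.parse(expr, mode="eval"))
```

Function calls, attribute access, and subscripts are simply not in the whitelist, so expressions like `__import__('os')` raise instead of executing.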

@tedhabeck tedhabeck marked this pull request as draft January 20, 2026 19:04
@crivetimihai crivetimihai added this to the Release 1.0.0-RC1 milestone Jan 21, 2026
@tedhabeck tedhabeck marked this pull request as ready for review January 22, 2026 20:22
@monshri monshri self-requested a review January 22, 2026 21:08
…scanners

- Convert cache.py to use async redis (redis.asyncio) for non-blocking I/O
- Add parallel scanner execution using asyncio.gather in input/output filters
- Add asyncio.to_thread for CPU-bound scanner operations
- Quiet llm_guard logger to ERROR level to reduce noise
- Fix tests to use prompt_id instead of deprecated name parameter
- Update test to use environment variables for redis host/port

Security: Scanner errors now fail-closed (is_valid=False) instead of being
skipped, ensuring policy evaluation denies requests when scanners fail.

Closes IBM#1959

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
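The `asyncio.gather` plus `asyncio.to_thread` combination the commit describes can be sketched roughly as follows. The scanner API (a `scan()` returning `(sanitized, is_valid, risk_score)`) and the result shape are assumptions modeled on llm_guard's style, not the plugin's actual code:

```python
import asyncio

def run_scanner(scanner, text):
    """CPU-bound scan; executed in a worker thread so the event loop stays free."""
    sanitized, is_valid, risk_score = scanner.scan(text)
    return {"is_valid": is_valid, "risk_score": risk_score}

async def scan_all(scanners, text):
    # asyncio.to_thread offloads each CPU-bound scan to the default thread
    # pool; asyncio.gather awaits them concurrently instead of sequentially.
    tasks = [asyncio.to_thread(run_scanner, s, text) for s in scanners]
    return await asyncio.gather(*tasks)
```

This only helps when scanners are safe to run concurrently, which is exactly the thread-safety question raised later in the review.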
@crivetimihai
Member

Rebase and Review Summary

I've rebased this PR onto main and made the following adjustments:

Changes Made During Rebase

  1. Resolved conflicts with merged PRs #2179 (llmguard: switch from pickle to orjson for cache serialization) and #2180 (llmguard: replace raw eval() with a safe AST-based evaluator)

  2. Security Fix: Fail-closed behavior for scanner errors

    • Original implementation skipped failed scanners, which caused a fail-open security issue
    • When a scanner failed, its result was missing from policy evaluation
    • Policy evaluation returned "Invalid expression" (truthy), causing requests to be allowed
    • Fix: Failed scanners now return is_valid=False with risk_score=1.0, ensuring denial
  3. Test fixes

    • Updated tests to use prompt_id instead of deprecated name parameter
    • Added environment variable support for redis host/port in tests
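The fail-closed behavior described in point 2 can be sketched as a wrapper around each scanner call; the function name and result keys here are illustrative assumptions:

```python
import asyncio

async def safe_scan(scanner, text):
    try:
        # llm_guard-style scanners return (sanitized, is_valid, risk_score)
        _, is_valid, risk_score = await asyncio.to_thread(scanner.scan, text)
        return {"is_valid": is_valid, "risk_score": risk_score}
    except Exception:
        # Fail closed: a crashing scanner contributes a denial with maximum
        # risk, instead of silently dropping out of policy evaluation.
        return {"is_valid": False, "risk_score": 1.0}
```

With this shape, a missing result can never make the policy expression evaluate to a truthy "Invalid expression": every scanner, including a failed one, contributes an explicit verdict.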

Remaining Questions

  1. Thread safety of llm_guard scanners: Are filter scanners guaranteed thread-safe when called concurrently via asyncio.to_thread? If scanners mutate shared state, concurrent requests could race. The parallelization only applies to filters (likely stateless), while sanitizers still run sequentially.

  2. Regression test for fail-closed path: Should we add a test that injects a failing scanner and asserts denial to prevent reintroducing fail-open behavior?
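The regression test suggested in question 2 could look something like this sketch: inject a scanner that always raises and assert the policy decision is a denial. All names here are hypothetical, not the plugin's real test helpers:

```python
import asyncio

class AlwaysFailingScanner:
    def scan(self, text):
        raise RuntimeError("injected scanner failure")

async def policy_allows(scanners, text):
    async def one(scanner):
        try:
            _, is_valid, _ = await asyncio.to_thread(scanner.scan, text)
            return is_valid
        except Exception:
            return False  # fail closed
    results = await asyncio.gather(*(one(s) for s in scanners))
    return all(results)  # policy: deny if any scanner denies

def test_failing_scanner_is_denied():
    assert asyncio.run(policy_allows([AlwaysFailingScanner()], "hi")) is False
```

Keeping such a test in the suite would catch any future change that reintroduces fail-open behavior by skipping failed scanners.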

Files Changed

  • cache.py - Async redis with orjson (not pickle)
  • llmguard.py - Parallel scanners with fail-closed error handling
  • plugin.py - Added await calls
  • tests/test_llmguardplugin.py - Fixed payload parameters

@crivetimihai crivetimihai merged commit 2b8f535 into IBM:main Jan 24, 2026
52 checks passed
kcostell06 pushed a commit to kcostell06/mcp-context-forge that referenced this pull request Feb 24, 2026
