Skip to content

llmguard: switch from pickle to orjson for cache serialization#2179

Merged
crivetimihai merged 1 commit intoIBM:mainfrom
RinZ27:refactor/llmguard-safe-cache
Jan 23, 2026
Merged

llmguard: switch from pickle to orjson for cache serialization#2179
crivetimihai merged 1 commit intoIBM:mainfrom
RinZ27:refactor/llmguard-safe-cache

Conversation

@RinZ27
Copy link
Copy Markdown
Contributor

@RinZ27 RinZ27 commented Jan 19, 2026

Swapping out pickle for orjson in the LLMGuard cache to address the deserialization risks mentioned in #2156. Since orjson is already in the project, I figured it's the cleanest way to harden this. I've also made sure it handles the tuple-to-array shift so the vault logic doesn't break.

  • Replaced pickle with orjson in cache.py.
  • Added recursive conversion for JSON arrays back into tuples.
  • No new dependencies added.

@RinZ27 RinZ27 requested a review from crivetimihai as a code owner January 19, 2026 09:10
@RinZ27 RinZ27 force-pushed the refactor/llmguard-safe-cache branch 2 times, most recently from 2038de5 to 3883e1a Compare January 19, 2026 09:20
@crivetimihai crivetimihai added this to the Release 1.0.0-RC1 milestone Jan 20, 2026
@araujof
Copy link
Copy Markdown
Member

araujof commented Jan 20, 2026

@tedhabeck FYI

@tedhabeck
Copy link
Copy Markdown
Collaborator

Pickle performance:

Plugin                            P:post       P:pre      R:post       R:pre      T:post       T:pre
----------------------------------------------------------------------------------------------------
LLMGuardPlugin                  26.069ms   128.718ms           —           —           —           —

orjson performance:

LLMGuardPlugin                  24.170ms   128.957ms           —           —           —           —

@crivetimihai crivetimihai self-assigned this Jan 20, 2026
@crivetimihai crivetimihai force-pushed the refactor/llmguard-safe-cache branch from 3883e1a to a423a69 Compare January 20, 2026 19:18
@crivetimihai
Copy link
Copy Markdown
Member

crivetimihai commented Jan 20, 2026

Hi @RinZ27 - thank you for this contribution! I've rebased it, and made some modifications. I think this still needs a discussion around sanitization of data, but looking forward to your input on this.

Review Changes Applied

This PR has been rebased onto main and enhanced during review. Here's a summary of the changes made beyond the original commit:

Fixes Applied

Issue Fix
Critical: _convert_to_tuple returned Tuple[Tuple] instead of List[Tuple] Renamed to _convert_to_list_of_tuples(), returns List[Tuple] as expected by Vault._tuples
Medium: orjson.dumps TypeError not handled Added try/except TypeError with graceful return (False, False)
Medium: orjson.loads JSONDecodeError not handled Added try/except orjson.JSONDecodeError returning None
Medium: PII logged in info-level logs Changed to debug level, logs only key + size (no vault contents)
Low: Corrupted cache entries not cleaned up Delete corrupted/invalid entries on retrieval failure
Low: Type hints inaccurate Fixed to `-> list
Low: Unexpected JSON shapes could crash downstream Treat as cache miss, delete from cache, return None
Low: Cache miss logged as error Changed to logger.debug (normal condition)

Final Diff Summary

  • Robust error handling for serialization and deserialization
  • PII-safe logging throughout
  • Type-correct return values

Trade-offs: orjson vs pickle

Why This Change is Important

Security (Primary Motivation):

  • pickle.loads() can execute arbitrary code during deserialization — while this would require write access to Redis (cache poisoning), it's better to use a different approach.
  • orjson only deserializes JSON data types — no code execution possible

Comparison

Aspect pickle orjson
Security ❌ Arbitrary code execution risk ✅ Safe — only JSON types
Type Support ✅ Any Python object ⚠️ JSON types only (strings, numbers, bools, lists, dicts, null)
Tuples ✅ Native support ⚠️ Become arrays, need conversion back
Performance Good ✅ 3-10x faster for JSON-serializable data
Human Readable ❌ Binary blob ✅ JSON text (debuggable)
Cross-language ❌ Python only ✅ Universal format

Trade-offs Accepted

  1. Type Restrictions: Vault data must be JSON-serializable. Verified that vault tuples are always List[Tuple[str, str, str]] containing only strings — no issue for this use case.

  2. Tuple Conversion: JSON doesn't have a tuple type, so List[Tuple] becomes List[List] during serialization. The _convert_to_list_of_tuples() function handles the conversion back.

  3. Behavioral Change: If non-JSON-serializable data is ever stored (e.g., bytes, UUID, custom objects), the cache write will now fail gracefully instead of working. This is intentional — it surfaces data shape issues early rather than silently storing potentially problematic data.

Alternatives Considered

  • json stdlib: Slower than orjson, same type restrictions
  • msgpack: Binary format, faster than JSON, but adds a new dependency
  • orjson with default handler: Could serialize more types, but adds complexity and may mask data issues

Conclusion

The security benefits of removing pickle outweigh the minor complexity of tuple conversion. The vault data structure is well-defined (string tuples only), making orjson a clean fit. Error handling ensures graceful degradation if assumptions are ever violated.

@crivetimihai crivetimihai force-pushed the refactor/llmguard-safe-cache branch from a423a69 to aaecaea Compare January 20, 2026 19:42
@crivetimihai crivetimihai requested a review from monshri January 20, 2026 19:45
@monshri
Copy link
Copy Markdown
Collaborator

monshri commented Jan 20, 2026

@crivetimihai One question that needs to be answered at the plugin level is do we want to throw error/PluginError when cache serialization fails or just log it? Right now, it's logging the serialization error and sending a False, False to the plugin.py.

Rest looks good to me.

@RinZ27
Copy link
Copy Markdown
Contributor Author

RinZ27 commented Jan 21, 2026

Just jumping in with my two cents here... I agree with the current approach (logging the error and returning False, False). Since caching is an optimization/auxiliary mechanism here, a serialization failure shouldn't block the main processing flow or crash the plugin execution. Graceful degradation seems safer for production.

@RinZ27
Copy link
Copy Markdown
Contributor Author

RinZ27 commented Jan 21, 2026

Appreciate the rebase and the extra polish on the function names and types, @crivetimihai. It's definitely looking much cleaner now. Also, thanks @tedhabeck for running those benchmarks—glad to see orjson giving us that slight performance edge too.

Signed-off-by: RinCodeForge927 <dangnhatrin90@gmail.com>
Signed-off-by: RinZ27 <222222878+RinZ27@users.noreply.github.com>
@crivetimihai crivetimihai force-pushed the refactor/llmguard-safe-cache branch from aaecaea to 07cdacd Compare January 23, 2026 17:51
@crivetimihai crivetimihai merged commit 391fca8 into IBM:main Jan 23, 2026
52 checks passed
@RinZ27 RinZ27 deleted the refactor/llmguard-safe-cache branch January 24, 2026 09:43
kcostell06 pushed a commit to kcostell06/mcp-context-forge that referenced this pull request Feb 24, 2026
)

Signed-off-by: RinCodeForge927 <dangnhatrin90@gmail.com>
Signed-off-by: RinZ27 <222222878+RinZ27@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants