llmguard: switch from pickle to orjson for cache serialization#2179
llmguard: switch from pickle to orjson for cache serialization#2179crivetimihai merged 1 commit intoIBM:mainfrom
Conversation
2038de5 to
3883e1a
Compare
|
@tedhabeck FYI |
|
Pickle performance: orjson performance: |
3883e1a to
a423a69
Compare
|
Hi @RinZ27 - thank you for this contribution! I've rebased it, and made some modifications. I think this still needs a discussion around sanitization of data, but looking forward to your input on this. Review Changes AppliedThis PR has been rebased onto Fixes Applied
Final Diff Summary
Trade-offs: orjson vs pickleWhy This Change is ImportantSecurity (Primary Motivation):
Comparison
Trade-offs Accepted
Alternatives Considered
ConclusionThe security benefits of removing |
a423a69 to
aaecaea
Compare
|
@crivetimihai One question that needs to be answered at the plugin level is do we want to throw error/PluginError when cache serialization fails or just log it? Right now, it's logging the serialization error and sending a Rest looks good to me. |
|
Just jumping in with my two cents here... I agree with the current approach (logging the error and returning |
|
Appreciate the rebase and the extra polish on the function names and types, @crivetimihai. It's definitely looking much cleaner now. Also, thanks @tedhabeck for running those benchmarks—glad to see orjson giving us that slight performance edge too. |
Signed-off-by: RinCodeForge927 <dangnhatrin90@gmail.com> Signed-off-by: RinZ27 <222222878+RinZ27@users.noreply.github.com>
aaecaea to
07cdacd
Compare
Swapping out
picklefororjsonin the LLMGuard cache to address the deserialization risks mentioned in #2156. Sinceorjsonis already in the project, I figured it's the cleanest way to harden this. I've also made sure it handles the tuple-to-array shift so the vault logic doesn't break.picklewithorjsonincache.py.