[PERFORMANCE][PLUGIN]: Optimize Cedar plugin - Replace synchronous requests with async

The Cedar plugin has several performance bottlenecks caused by synchronous calls:

### 1. Blocking Policy Evaluation (is_authorized)
The call to `is_authorized` inside `_evaluate_policy` is synchronous. Although cedarpy wraps Rust code, the call runs on the event-loop thread and blocks the loop until it returns, regardless of whether the binding releases the Global Interpreter Lock (GIL).

**Impact:** Every policy check stalls the event loop, so the gateway cannot serve other concurrent requests (e.g., streaming tokens for other users) until the check completes.

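**Fix:** Offload this CPU-bound call to a worker thread using `asyncio.to_thread` (Python 3.9+). This frees the loop only if cedarpy releases the GIL during evaluation, which should be verified; if the binding holds the GIL, the event loop will still stall.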


```python
import asyncio

from cedarpy import AuthzResult, Decision, is_authorized

# Change _evaluate_policy to be async or call it using to_thread
async def _evaluate_policy_async(self, request: dict, policy_expr: str) -> str:
    # Offload the blocking Rust call to a separate thread
    result: AuthzResult = await asyncio.to_thread(
        is_authorized, request, policy_expr, []
    )
    return "Allow" if result.decision == Decision.Allow else "Deny"
```
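
A small experiment can confirm whether the offload actually frees the loop. The sketch below is not plugin code: it uses `time.sleep` as a stand-in for `is_authorized` (an assumption, since `time.sleep` releases the GIL and cedarpy may or may not), and the `heartbeat`, `count_ticks`, and `run_comparison` names are hypothetical. It counts how often a background coroutine gets to run while the blocking call is in flight.

```python
import asyncio
import time


def blocking_check(duration: float = 0.2) -> str:
    """Stand-in for is_authorized; time.sleep releases the GIL while waiting."""
    time.sleep(duration)
    return "Allow"


async def heartbeat(ticks: list, interval: float = 0.01) -> None:
    """Append a timestamp every `interval` seconds until cancelled."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def count_ticks(offload: bool) -> int:
    """Count heartbeat ticks that occur while the blocking call runs."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    before = len(ticks)
    if offload:
        await asyncio.to_thread(blocking_check)
    else:
        blocking_check()  # runs on the event-loop thread and stalls it
    during = len(ticks) - before
    task.cancel()
    return during


def run_comparison() -> tuple:
    """Return (ticks during inline call, ticks during offloaded call)."""
    inline = asyncio.run(count_ticks(offload=False))
    offloaded = asyncio.run(count_ticks(offload=True))
    return inline, offloaded
```

With the stand-in, the inline call allows no ticks while the offloaded call allows many. Swapping `blocking_check` for a real `is_authorized` call would show whether cedarpy behaves the same way.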


### 2. Redundant & Blocking Policy Parsing
The plugin converts the policy from YAML/DSL to Cedar text inside every hook (`prompt_pre_fetch`, `tool_pre_invoke`, etc.).

Lines 268-278 (and others): `self._yamlpolicy2text` and `self._dsl2cedar` are called on every request.

**Impact:** The conversion relies on string manipulation and regex parsing, which is CPU-bound. Repeating it on every request wastes CPU and blocks the event loop, even though an unchanged policy always produces the same text.

**Fix:** Parse the policy once in `__init__` and store the resulting Cedar text. Re-parse only if the configuration actually changes (unlikely in a plugin context).

```python
# In __init__
self.cached_policy_text = None
if self.cedar_config.policy:
    if self.cedar_config.policy_lang == "cedar":
        self.cached_policy_text = self._yamlpolicy2text(self.cedar_config.policy)
    elif self.cedar_config.policy_lang == "custom_dsl":
        self.cached_policy_text = self._dsl2cedar(self.cedar_config.policy)

# Then in hooks, simply use self.cached_policy_text
```
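
If the configuration can change at runtime, the cached text can be keyed on the policy source so it is rebuilt only when the source differs. A minimal sketch, assuming a hypothetical `PolicyTextCache` helper and a converter callable standing in for `_yamlpolicy2text` or `_dsl2cedar` (neither name below exists in the plugin):

```python
import hashlib
import json
from typing import Any, Callable, Optional


class PolicyTextCache:
    """Hypothetical helper: caches converted Cedar text keyed by the policy source."""

    def __init__(self, convert: Callable[[Any], str]):
        self._convert = convert  # stand-in for _yamlpolicy2text or _dsl2cedar
        self._fingerprint: Optional[str] = None
        self._text: str = ""
        self.conversions = 0  # exposed only to make the behaviour observable

    @staticmethod
    def _fingerprint_of(policy: Any) -> str:
        # Serialize deterministically so equal policies give equal fingerprints.
        raw = json.dumps(policy, sort_keys=True, default=str)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, policy: Any) -> str:
        """Return cached text, converting again only if the policy changed."""
        fingerprint = self._fingerprint_of(policy)
        if fingerprint != self._fingerprint:
            self._text = self._convert(policy)
            self._fingerprint = fingerprint
            self.conversions += 1
        return self._text


def demo_convert(policy: list) -> str:
    """Toy converter for illustration only; not the plugin's real logic."""
    return "\n".join(f"// rule {rule['id']}" for rule in policy)
```

Fingerprinting still costs one serialization per call, so this only pays off when conversion is noticeably more expensive than hashing. If the configuration never changes after startup, caching in `__init__` is simpler.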

### 3. Synchronous Regex Redaction
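The `_redact_output` method uses `re.sub` (Line 233). For large LLM outputs (e.g., `tool_post_invoke` payloads), regex operations on the event-loop thread can cause noticeable latency spikes.

**Fix:** For large payloads, offload the redaction to a thread as well. Note that CPython's `re` module does not release the GIL while matching, so the benefit should be measured; processing the payload in chunks and yielding to the loop between them is an alternative.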
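
A sketch of the size-gated offload, under stated assumptions: the pattern, the replacement marker, the threshold, and the function names are all hypothetical and are not taken from the plugin. Small payloads are redacted inline to avoid the cost of a thread handoff, and only large ones go through `asyncio.to_thread`.

```python
import asyncio
import re

# Hypothetical values for illustration; the plugin's real pattern and limits may differ.
REDACT_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # compiled once, not per call
OFFLOAD_THRESHOLD = 64 * 1024  # characters; tune from measurements


def redact_sync(text: str) -> str:
    """Replace every match of REDACT_PATTERN with a fixed marker."""
    return REDACT_PATTERN.sub("[REDACTED]", text)


async def redact_output(text: str, threshold: int = OFFLOAD_THRESHOLD) -> str:
    """Redact inline for small payloads, in a worker thread for large ones."""
    if len(text) <= threshold:
        return redact_sync(text)
    return await asyncio.to_thread(redact_sync, text)
```

A hook would call it as `redacted = await redact_output(payload_text)`. Whether the thread hop pays off for a given pattern and payload size is best confirmed by profiling.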

The sketch below combines optimizations 1 and 2 in the plugin class:

```python
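import asyncio

from cedarpy import Decision, is_authorized

# Plugin, PluginConfig, CedarConfig, logger, and the payload/result types
# come from the gateway's plugin framework and the plugin module itself.
class CedarPolicyPlugin(Plugin):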
    def __init__(self, config: PluginConfig):
        super().__init__(config)
        self.cedar_config = CedarConfig.model_validate(self._config.config)
        self.jwt_info = {}
        
        # OPTIMIZATION 1: Pre-compute policy text at startup
        self.policy_text = ""
        if self.cedar_config.policy:
            if self.cedar_config.policy_lang == "cedar":
                self.policy_text = self._yamlpolicy2text(self.cedar_config.policy)
            elif self.cedar_config.policy_lang == "custom_dsl":
                self.policy_text = self._dsl2cedar(self.cedar_config.policy)
        
        logger.info(f"CedarPolicyPlugin initialised with configuration {self.cedar_config}")

    # OPTIMIZATION 2: Async wrapper for the blocking library call
    async def _evaluate_policy(self, request: dict, policy_expr: str) -> str:
        """Async wrapper for blocking is_authorized call."""
        def blocking_check():
            result = is_authorized(request, policy_expr, [])
            return "Allow" if result.decision == Decision.Allow else "Deny"
            
        return await asyncio.to_thread(blocking_check)

    async def prompt_pre_fetch(self, payload: PromptPrehookPayload, context: PluginContext) -> PromptPrehookResult:
        # ... setup code ...
        
        # Use cached policy text instead of re-parsing
        if not self.policy_text:
            # handle error
            pass

        if self.cedar_config.policy_output_keywords:
            # ... prepare requests ...
            if view_full and self.policy_text:
                request = self._preprocess_request(user, view_full, payload.prompt_id, hook_type)
                # Await the new async evaluator
                result_full = await self._evaluate_policy(request, self.policy_text)
            
            # ... repeat for view_redacted ...
```
