[PERFORMANCE]: Precompile regex patterns across plugins #1834

@crivetimihai

Summary

Multiple plugins pass regex pattern strings to re functions, or compile patterns, on every invocation rather than once. Precompiling the patterns during plugin initialization reduces per-request CPU overhead.

Evidence (current code)

  • plugins/regex_filter/search_replace.py: re.sub with pattern strings per payload value.
  • plugins/sql_sanitizer/sql_sanitizer.py: _strip_sql_comments / _find_issues use pattern strings per call.
  • plugins/html_to_markdown/html_to_markdown.py: _strip_tags uses many inline regexes.
  • plugins/markdown_cleaner/markdown_cleaner.py: _clean_md uses inline regexes.
  • plugins/json_repair/json_repair.py: _repair uses inline regexes per attempt.
  • plugins/argument_normalizer/argument_normalizer.py: _merge_overrides uses re.search with string patterns.
  • plugins/code_safety_linter/code_safety_linter.py: tool_post_invoke uses inline regex patterns.
  • plugins/content_moderation/content_moderation.py: custom patterns use re.search/re.findall per request.
  • plugins/virus_total_checker/virus_total_checker.py: URL and allow/deny patterns compiled per hook.

Impact

  • CPU overhead per plugin invocation; scales with request rate and payload size.
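
One rough way to gauge the per-call cost is a small timeit comparison. The sketch below is not taken from any of the listed plugins: the pattern, the sample text, and the function names are hypothetical. Python's re module keeps an internal cache of recently compiled patterns, so the gap measured here mostly reflects the cache lookup on each call; absolute numbers will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical pattern and payload, chosen only for illustration.
PATTERN = r"\d{4}-\d{2}-\d{2}"
COMPILED = re.compile(PATTERN)
TEXT = "released 2024-01-15, patched 2024-02-01"


def inline_call() -> list[str]:
    # Pattern string handed to the module-level function on every call.
    return re.findall(PATTERN, TEXT)


def precompiled_call() -> list[str]:
    # Compiled re.Pattern object reused on every call.
    return COMPILED.findall(TEXT)


inline_s = timeit.timeit(inline_call, number=20_000)
precompiled_s = timeit.timeit(precompiled_call, number=20_000)
print(f"inline:      {inline_s:.4f}s")
print(f"precompiled: {precompiled_s:.4f}s")
```

A plugin that uses many distinct patterns, or builds patterns dynamically, may see a larger difference than this single-pattern case, since cached entries can be evicted and recompiled.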

Proposed fix

  • Precompile regex patterns during plugin initialization or config load.
  • Store compiled re.Pattern objects in plugin state/config and reuse per invocation.
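
A minimal sketch of the proposed fix, assuming a search-and-replace style plugin. The class and field names (SearchReplaceRule, SearchReplacePlugin, apply) are hypothetical and do not mirror the actual code in plugins/regex_filter; the point is only that re.compile runs once in __init__ and the resulting re.Pattern objects are reused per invocation.

```python
import re
from dataclasses import dataclass


@dataclass
class SearchReplaceRule:
    """Hypothetical config entry: a pattern string and its replacement."""

    pattern: str
    replacement: str
    flags: int = 0


class SearchReplacePlugin:
    """Illustrative plugin, not the real class from the repository."""

    def __init__(self, rules: list[SearchReplaceRule]) -> None:
        # Compile once at initialization. An invalid pattern raises
        # re.error here instead of on the first request.
        self._compiled = [
            (re.compile(rule.pattern, rule.flags), rule.replacement)
            for rule in rules
        ]

    def apply(self, text: str) -> str:
        # Reuse the stored re.Pattern objects on every invocation.
        for pattern, replacement in self._compiled:
            text = pattern.sub(replacement, text)
        return text


plugin = SearchReplacePlugin(
    [
        SearchReplaceRule(r"\bfoo\b", "bar"),
        SearchReplaceRule(r"\s+", " "),
    ]
)
result = plugin.apply("foo   and  food")
print(result)
```

Compiling in the constructor also surfaces malformed patterns at config load time, which is usually easier to diagnose than a failure in the middle of a request.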

Acceptance criteria

  • No per-request re.compile in the above plugins.
  • Same matching behavior and output.
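
The second criterion can be checked with a small equivalence test that runs the old string-pattern path and the precompiled path over the same inputs. The pattern, flags, and samples below are hypothetical stand-ins, not the real sql_sanitizer patterns. Note that any flags previously passed to re.sub or re.search must move into the re.compile call, otherwise matching behavior changes.

```python
import re

# Hypothetical pattern and flags, for illustration only.
PATTERN = r"--[^\n]*|\bdrop\b"
FLAGS = re.IGNORECASE
COMPILED = re.compile(PATTERN, FLAGS)  # flags move into compile()


def scrub_inline(text: str) -> str:
    # Old style: pattern string and flags passed on every call.
    return re.sub(PATTERN, "", text, flags=FLAGS)


def scrub_precompiled(text: str) -> str:
    # New style: reuse the compiled pattern; no flags argument here.
    return COMPILED.sub("", text)


SAMPLES = [
    "SELECT 1 -- trailing comment",
    "DROP TABLE users",
    "no comment here",
    "",
    "a -- b\nc -- d",
]

# Collect every input where the two paths disagree.
mismatches = [s for s in SAMPLES if scrub_inline(s) != scrub_precompiled(s)]
print("mismatches:", mismatches)
```

In a real migration, the sample list would come from each plugin's existing test fixtures so the comparison covers the inputs the plugin already claims to handle.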

Metadata

Labels

  • SHOULD (P2: Important but not vital; high-value items that are not crucial for the immediate release)
  • performance (Performance related items)
