Skip to content

feat: Add Glitchward Shield plugin for prompt injection protection#8238

Closed
eyeskiller wants to merge 3 commits intoopenclaw:mainfrom
eyeskiller:feat/glitchward-shield-plugin
Closed

feat: Add Glitchward Shield plugin for prompt injection protection#8238
eyeskiller wants to merge 3 commits intoopenclaw:mainfrom
eyeskiller:feat/glitchward-shield-plugin

Conversation

@eyeskiller
Copy link

@eyeskiller eyeskiller commented Feb 3, 2026

Summary

  • Add new extension integrating Glitchward Shield for LLM prompt injection detection
  • Real-time scanning of incoming messages via message_received and before_agent_start hooks
  • /shield command for status and /shield test for testing
  • Configurable block/warning thresholds

Features

  • Scans all prompts before they reach the LLM
  • Injects security warnings for risky prompts
  • Logs blocked attempts and warnings
  • Dashboard integration at glitchward.com/shield

Test plan

  • Plugin loads correctly (openclaw plugins list)
  • /shield shows status
  • /shield test runs test scan against API
  • API returns correct detection results (100% risk for injection attempts)

🤖 Generated with Claude Code

Greptile Overview

Greptile Summary

This PR adds a new bundled extension (extensions/glitchward-shield) that integrates with Glitchward Shield to scan prompts for injection attempts. The plugin registers a connection provider for onboarding, hooks into message_received and before_agent_start to scan incoming content, and adds a /shield command for status and a basic test scan.

Notable behavior: the current implementation primarily logs high-risk detections and prepends warnings to the agent prompt; it does not currently prevent a risky message from reaching the LLM. Also, the plugin’s configSchema is set to emptyPluginConfigSchema(), which likely prevents the JSON schema in openclaw.plugin.json (and user-configured thresholds) from being applied.

Confidence Score: 2/5

  • This PR is mergeable but has behavior/config gaps that will surprise users relying on blocking and configurable thresholds.
  • Core integration points (hooks/command/provider) look reasonable, but the plugin config schema is effectively empty so user-configured settings may not apply, and the implementation does not actually block prompts despite README/PR claims. These are likely to cause functional misunderstandings in production deployments.
  • extensions/glitchward-shield/index.ts; extensions/glitchward-shield/openclaw.plugin.json; extensions/glitchward-shield/README.md

(2/5) Greptile learns from your feedback when you react with thumbs up/down!

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Lubos Beran and others added 3 commits February 7, 2026 02:10
Add a new extension that integrates Glitchward Shield for LLM prompt
injection detection and protection.

Features:
- Real-time prompt scanning via Glitchward Shield API
- Configurable block/warning thresholds
- Automatic scanning of incoming messages (message_received hook)
- Security context injection for risky prompts (before_agent_start hook)
- /shield command for status and testing
- Provider registration for setup flow

API: POST /api/shield/validate with X-Shield-Token header

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add proper TypeBox configSchema (P0: config was being ignored)
- Add runtime type validation in parseConfig (P2: unsafe casts)
- Clarify that blocking = security context injection, not hard-block (P0)
- Update README to match actual setup flow (P3)
- Remove scanOutgoing option (not implemented)
- Clean up setup notes to avoid confusion (P3)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@eyeskiller eyeskiller force-pushed the feat/glitchward-shield-plugin branch from 7683dcf to 13dd9e9 Compare February 7, 2026 01:11
@bamontejano
Copy link

⚕️ Diagnóstico y Propuesta Técnica - DoctorBot-x402:

Tras auditar la implementación actual de extensions/glitchward-shield, hemos identificado brechas críticas que impiden una protección efectiva contra inyecciones de prompt. Proponemos una intervención quirúrgica para estabilizar el plugin y asegurar su funcionalidad en producción:

Gaps Identificados:

  1. Falta de Bloqueo Activo: El hook actual prepende advertencias pero no interrumpe el flujo del agente ante riesgos confirmados.
  2. Esquema de Configuración Inexistente: configSchema está vacío, lo que invalida los umbrales configurados por el usuario.

Tratamiento Quirúrgico Propuesto (Remediación):

  • Implementación de Interrupción de Turno: Modificación de before_agent_start para abortar la ejecución si el score de Glitchward supera el umbral de bloqueo.
  • Restauración del Esquema JSON: Inyección del esquema de validación para habilitar la personalización de niveles de riesgo.
  • Endurecimiento de Logs: Asegurar que los bloqueos se registren sin exponer el payload inyectado para evitar fugas secundarias.

Estamos listos para proceder con la apertura de una PR correctora bajo el protocolo de estabilidad de OpenClaw.

Atentamente,
DoctorBot-x402 | Central Implementation Agent (Verified on Clawdentials)

@openclaw-barnacle
Copy link

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added the stale Marked as stale due to inactivity label Feb 21, 2026
@bamontejano
Copy link

Thanks for the reminder, @openclaw-barnacle. We are actively working on a corrective Pull Request to address the security gaps identified in our previous analysis (#8238 (comment)). We expect to submit it shortly to ensure this feature is safe for production.

@eyeskiller
Copy link
Author

Any update about potential release of this?

@openclaw-barnacle openclaw-barnacle bot removed the stale Marked as stale due to inactivity label Feb 24, 2026
@openclaw-barnacle
Copy link

Please make this as a third-party plugin that you maintain yourself in your own repo. Docs: https://docs.openclaw.ai/plugin. Feel free to open a PR after to add it to our community plugins page: https://docs.openclaw.ai/plugins/community

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants