[Feature]: before-send hook for outbound message filtering #24618

@metacrafttech


Summary

Before-send hook for outbound message filtering

Problem to solve

When an agent's LLM response includes chain-of-thought reasoning alongside the intended reply, the entire output is delivered to the end user. There is no hook to inspect or filter outbound messages before delivery.

We've had 3 incidents in 2 weeks where internal processing text leaked to external WhatsApp group chats and DMs, including lines like "Let me check who this contact is" and full system status dumps.

Use cases:
• Internal processing leak prevention (our case: 3 incidents)
• PII redaction before delivery
• Content safety scanning
• Audit logging of all outbound messages
• Rate limiting per-recipient
• Language/tone enforcement per-group

Related: #20246, #12960

Proposed solution

Add a `message:before-send` hook event that fires before any outgoing message is delivered to a channel. The hook should support:
1. Inspecting the message text, target channel, and session context
2. Modifying the message content (rewrite)
3. Cancelling the message entirely (return `{ cancel: true }`)
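
A minimal sketch of what the three capabilities above could look like as a handler contract. Every name here (`OutboundMessage`, `BeforeSendResult`, `BeforeSendHandler`, `runBeforeSend`) is hypothetical and does not exist in the project today, and the runner is synchronous only for brevity; a real hook would probably be async.

```typescript
// Hypothetical shapes for illustration; not an existing API.
interface OutboundMessage {
  text: string;
  channel: string; // e.g. "whatsapp", "telegram"
  recipient: string;
  sessionId: string;
}

type BeforeSendResult =
  | { cancel: true; reason?: string } // drop the message entirely
  | { text: string } // rewrite the message content
  | undefined; // pass through unchanged

type BeforeSendHandler = (msg: Readonly<OutboundMessage>) => BeforeSendResult;

// Runs handlers in order. Returns the message to deliver,
// or null if any handler cancelled it.
function runBeforeSend(
  msg: OutboundMessage,
  handlers: BeforeSendHandler[],
): OutboundMessage | null {
  let current = msg;
  for (const handler of handlers) {
    const result = handler(current);
    if (result === undefined) continue;
    if ("cancel" in result) return null;
    current = { ...current, text: result.text };
  }
  return current;
}
```

Returning `null` on the first cancel means later handlers never see a blocked message, which seems like the safer default for leak prevention; whether audit-logging handlers should still run after a cancel is an open design question.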

This would slot into `normalizeReplyPayload()` or `deliverOutboundPayloads()`, after NO_REPLY/HEARTBEAT_OK stripping but before channel-specific delivery.
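
To make the proposed ordering concrete, here is a rough sketch of the three stages in sequence. The real signatures of `normalizeReplyPayload()` and `deliverOutboundPayloads()` are not shown in this issue, so `stripControlTokens`, `sendPipeline`, `Hook`, and `Deliver` are hypothetical stand-ins, and the token stripping is an assumption about behavior rather than the actual implementation.

```typescript
// Hypothetical stand-ins; not the project's real functions.
const CONTROL_TOKENS = ["NO_REPLY", "HEARTBEAT_OK"];

// Stage 1 (assumed behavior): remove control tokens from the reply text.
function stripControlTokens(text: string): string {
  let out = text;
  for (const token of CONTROL_TOKENS) out = out.split(token).join("");
  return out.trim();
}

type Hook = (
  text: string,
  channel: string,
) => { cancel: true } | { text: string } | undefined;

type Deliver = (text: string, channel: string) => void;

// Returns true if something was delivered, false if it was dropped.
function sendPipeline(
  raw: string,
  channel: string,
  hook: Hook,
  deliver: Deliver,
): boolean {
  const stripped = stripControlTokens(raw);
  if (stripped === "") return false; // nothing left to send

  // Stage 2: the before-send hook sees only post-stripping text.
  const result = hook(stripped, channel);
  if (result !== undefined && "cancel" in result) return false;

  // Stage 3: channel-specific delivery gets the final text.
  deliver(result === undefined ? stripped : result.text, channel);
  return true;
}
```

Running the hook after stripping means handlers never have to special-case control tokens, and running it before delivery means a cancel is guaranteed to stop the message on every channel.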

Alternatives considered

• SOUL.md / system prompt rules → the agent ignores them under cognitive load
• `agent:bootstrap` hook to inject rules → still prompt-level, can't block delivery
• `heartbeat.target: "none"` → only fixes heartbeat-specific leaks
• Custom outbound-guard plugin with 26 regex patterns → detects leaks but can't intercept delivery
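
For illustration, this is roughly how a regex-based guard like the last alternative could become blocking if a before-send hook existed. The two patterns are made-up examples based on the leaks described in this issue, not the plugin's actual 26 patterns, and `outboundGuard` and `GuardResult` are hypothetical names.

```typescript
// Illustrative patterns only; not the real plugin's pattern set.
const LEAK_PATTERNS: RegExp[] = [
  /\blet me check\b/i,
  /\bsystem status\b/i,
];

type GuardResult = { cancel: true; matched: string } | undefined;

// Returns a cancel result for the first matching pattern,
// or undefined to let the message through.
function outboundGuard(text: string): GuardResult {
  for (const pattern of LEAK_PATTERNS) {
    if (pattern.test(text)) return { cancel: true, matched: pattern.source };
  }
  return undefined;
}
```

Today the same detection can only log or alert after the fact; with the hook, the `cancel` result would stop delivery.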

Impact

We run a multi-channel agent (WhatsApp, Telegram, Slack, Discord) serving both the owner and external contacts. Internal reasoning text has leaked to external users 3 times in 2 weeks, damaging trust and professionalism. There is currently no architectural way to prevent this: prompt-level rules fail under cognitive load, so the only real fix is a delivery-layer hook that can intercept and block a message before it reaches the recipient. Without this, any agent serving external users on messaging platforms is one bad inference away from exposing internal operations to the wrong person.

Evidence/examples

No response

Additional information

No response

Metadata

Labels: enhancement (New feature or request)
