Agent executes code block contents as shell commands instead of writing to files #39233

@yaseenkadlemakki

Summary

When an agent is given a task that includes code blocks intended to be written to files, it sometimes executes the code block contents directly as shell commands instead of creating the files first. This causes cascading failures as Python syntax is interpreted by the shell.

Observed behaviour

Given a prompt like:

Create sentinel/remediation/actions.py with this content:

from dataclasses import dataclass
from enum import Enum
class RemediationStatus(str, Enum):
    PENDING = 'pending'

The agent ran the code directly in the shell:

zsh:1: command not found: python
zsh:2: command not found: from
zsh:3: command not found: from
zsh:4: command not found: from
zsh:7: no matches found: RemediationStatus(str, Enum):
zsh:8: command not found: PENDING

The agent should instead have written the content to the file, using either a write tool call or a heredoc (cat > file.py << 'EOF').
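For illustration, the intended outcome can be sketched without involving a shell at all. The helper name write_source_file and its parameters are hypothetical and are not part of any agent's actual tool API; this is only a minimal sketch of what "write the content to the file" means here.

```python
from pathlib import Path


def write_source_file(root: Path, relative_path: str, content: str) -> Path:
    """Write content to root/relative_path, creating parent directories.

    Hypothetical helper: the content is treated as opaque text and is
    never passed to a shell.
    """
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target
```

Because the content never reaches a shell parser, Python syntax such as from ... import or class X(str, Enum): cannot be misread as commands.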

Root cause hypothesis

The agent conflates "show the code" with "run the code". When the prompt contains a code block adjacent to a file path, the agent should infer the intent is file creation — not shell execution. This is especially likely when:

  1. The file extension is .py, .ts, .yaml, etc. (not a shell script)
  2. The code contains Python/TS syntax that is obviously not valid shell
  3. The surrounding context says "create" or "implement" rather than "run"

Expected behaviour

The agent should:

  1. Recognise that a python block adjacent to a file path is file content, not a command
  2. Use the write tool (or equivalent) to create the file
  3. Only execute shell commands when the intent is clearly execution (e.g. run, execute, test)
  4. When in doubt, prefer writing to a file over running as a command — the failure mode of writing is recoverable; the failure mode of running arbitrary Python as shell is catastrophic
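The decision order above could be sketched roughly as follows. This is an assumption-laden illustration, not how any agent currently decides: the function name choose_action and the return labels are hypothetical, and the verb lists only contain the words mentioned in this issue.

```python
import re

# Verbs taken from this issue's examples; a real list would be longer.
CREATE_VERBS = re.compile(r"\b(create|implement)\b", re.IGNORECASE)
EXECUTE_VERBS = re.compile(r"\b(run|execute|test)\b", re.IGNORECASE)


def choose_action(instruction: str) -> str:
    """Return 'write_file' or 'run_command' for an instruction (sketch)."""
    # Creation wording wins, even if an execution verb also appears.
    if CREATE_VERBS.search(instruction):
        return "write_file"
    # Execute only when the instruction clearly asks for it.
    if EXECUTE_VERBS.search(instruction):
        return "run_command"
    # When in doubt, prefer the recoverable action.
    return "write_file"
```

For example, choose_action("Create sentinel/remediation/actions.py with this content:") yields "write_file", while choose_action("Run the test suite") yields "run_command".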

Suggested fix

Add a pre-execution heuristic: if the command contains Python/TS/YAML syntax tokens (import, from, def, class, interface, a colon at line end, etc.) and does not start with a known shell keyword or binary, warn or refuse to execute, and suggest writing to a file instead.
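A minimal sketch of such a heuristic is below. Everything in it is an assumption for illustration: the name looks_like_source, the small KNOWN_SHELL_STARTS allow-list, and the specific patterns are hypothetical, and a real implementation would need a much fuller list of shell keywords and binaries.

```python
import re

# Hypothetical allow-list of shell keywords and binaries (illustrative only).
KNOWN_SHELL_STARTS = frozenset({
    "cat", "cd", "echo", "export", "for", "git", "if",
    "ls", "mkdir", "python", "python3", "while",
})

# Line-level patterns suggesting Python/TS/YAML source rather than shell.
SOURCE_PATTERNS = (
    re.compile(r"^\s*from\s+[\w.]+\s+import\s"),
    re.compile(r"^\s*import\s+[\w.]+"),
    re.compile(r"^\s*(def|class|interface)\s+\w+"),
    re.compile(r":\s*$"),  # colon at line end (Python block or YAML key)
)


def looks_like_source(command: str) -> bool:
    """Return True if command looks like source code, not a shell command."""
    lines = [ln for ln in command.splitlines() if ln.strip()]
    if not lines:
        return False
    # Commands that start with a known shell word are allowed through.
    if lines[0].split()[0] in KNOWN_SHELL_STARTS:
        return False
    return any(p.search(ln) for ln in lines for p in SOURCE_PATTERNS)
```

With this sketch, the class definition from the prompt above would be flagged, while a heredoc starting with cat would pass because of its first word. The allow-list check runs first on purpose, so that legitimate commands embedding source text are not blocked.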

Alternatively, improve the system prompt guidance to explicitly distinguish between "write file" and "execute command" intents when code blocks are present.

Repro context

  • Agent: Claude Sonnet 4.6 via OpenClaw main session
  • Task: Implement a multi-file Python module from a structured prompt
  • Surface: Discord channel (exec via claude -p --dangerously-skip-permissions subprocess)
  • Prompt style: Inline code blocks with file paths and class definitions

Impact

High — causes complete task failure and requires human intervention to restart. Particularly damaging in multi-file implementation tasks where a single misfire aborts the entire sequence.
