fix: guard print() calls in run_conversation() against OSError when stdout is unavailable (systemd/headless) #845

## Problem

When hermes-agent runs as a systemd service (`StandardOutput=journal`) and the journal pipe becomes unavailable (idle timeout, buffer exhaustion, socket reset), any `print()` call inside `run_conversation()` raises `OSError: [Errno 5] Input/output error`. Any headless daemon deployment (systemd, Docker, nohup) can hit this condition in production.
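The failure can be approximated without systemd. The sketch below is only an illustration: `BrokenStdout` and `print_raises_oserror` are hypothetical names, and the stream simulates a dead journal pipe by raising `EIO` on every write rather than reproducing the real pipe state.

```python
import errno
import io
from contextlib import redirect_stdout


class BrokenStdout(io.TextIOBase):
    """Hypothetical stand-in for a stdout whose underlying pipe is gone."""

    def writable(self):
        return True

    def write(self, s):
        # Simulate the error reported when the journal pipe is unavailable.
        raise OSError(errno.EIO, "Input/output error")


def print_raises_oserror():
    """Return the OSError that print() raises on the broken stream, else None."""
    try:
        with redirect_stdout(BrokenStdout()):
            print("hello")
    except OSError as exc:
        return exc
    return None
```

Calling `print_raises_oserror()` returns the exception object, so its `errno` can be checked against `errno.EIO` in a regression test.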

Two calls in `run_agent.py` sit in the **critical failure path for cron jobs** running with `quiet_mode=True`:

### Line ~4062 — inside `quiet_mode` branch
```python
if self.quiet_mode:
    clean = self._strip_think_blocks(turn_content).strip()
    if clean:
        print(f"  ┊ 💬 {clean}")  # raises OSError when stdout pipe is broken
```
This fires during any tool-calling turn when the model produces intermediate commentary. The `OSError` becomes the exception `e` caught by the outer `except Exception` handler.

### Line ~4228 — in `except Exception` error handler
```python
except Exception as e:
    error_msg = f"Error during OpenAI-compatible API call #{api_call_count}: {str(e)}"
    print(f"❌ {error_msg}")  # also raises OSError — now propagates out of run_conversation()
```
When the `OSError` from line ~4062 arrives here as `e`, this second `print()` raises `OSError` as well. Because it is raised inside the handler itself, nothing catches it: the exception propagates out of `run_conversation()`, the cron scheduler marks the job as `status: "error"`, and the agent's completed work is never delivered.
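The two-step failure described above can be sketched in isolation. This is a simplified model, not the code in `run_agent.py`: `run_conversation_sketch`, `demo_double_fault`, and `BrokenStdout` are hypothetical names, and the broken stream is simulated.

```python
import errno
import io
from contextlib import redirect_stdout


class BrokenStdout(io.TextIOBase):
    """Hypothetical stream that fails every write, like a dead journal pipe."""

    def write(self, s):
        raise OSError(errno.EIO, "Input/output error")


def run_conversation_sketch():
    """Simplified stand-in for the control flow around the two print() calls."""
    try:
        print("  ┊ 💬 intermediate commentary")  # first OSError is raised here
        return "completed work"
    except Exception as e:
        error_msg = f"Error during API call: {e}"
        print(f"❌ {error_msg}")  # second OSError, raised inside the handler
        return error_msg


def demo_double_fault():
    """Return the exception that escapes the sketch, or None if none escapes."""
    try:
        with redirect_stdout(BrokenStdout()):
            run_conversation_sketch()
    except OSError as exc:
        return exc
    return None
```

The escaping exception carries the first `OSError` as its `__context__`, which is what produces the "During handling of the above exception, another exception occurred" line in a traceback.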

## Additional unguarded `print()` calls in the same hot loop

The same pattern exists at several points **not** gated by `quiet_mode`, reachable during any cron job run:

| Approx. line | Triggered by |
|---|---|
| ~4064 | Model context length discovery (first run per model) |
| ~4108 | Interrupt received during API call |
| ~4153–4161 | Any API retry (rate limit, timeout, network error) — **most likely in production** |
| ~4166 | Interrupt detected during retry error handling |
| ~4458 | All API retries exhausted |

Lines ~4153–4161 carry the highest risk: they fire on every transient API error (rate limits, network timeouts), and such errors are common in production.

## Observed failure

Confirmed on a deployment running as a systemd user service (`StandardOutput=journal`). Cron jobs scheduled at 06:00 and 13:00 UTC (when the system is idle and the journal pipe is stale) fail consistently with this traceback in the output file:

```
File "run_agent.py", line 4062, in run_conversation
    print(f"  ┊ 💬 {clean}")
OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

File "run_agent.py", line 4228, in run_conversation
    print(f"❌ {error_msg}")
OSError: [Errno 5] Input/output error
```

The same jobs run successfully at 22:00 UTC, when the system has active user sessions and the journal pipe is healthy. This points to an environmental stdout availability issue rather than a logic bug.

## Fix

Wrap each affected `print()` in `try/except OSError`, falling back to `logger.error()` for calls inside error handlers (where losing the message would hide the root cause):

```python
# Cosmetic lines (quiet_mode display, status messages) — silent drop is fine:
try:
    print(f"  ┊ 💬 {clean}")
except OSError:
    pass

# Error handler lines — must not lose the message:
try:
    print(f"❌ {error_msg}")
except OSError:
    logger.error(error_msg)

# API retry error block (~4153–4161) — same pattern:
try:
    print(f"{self.log_prefix}⚠️  API call failed ...")
except OSError:
    logger.warning(...)
```
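If repeating the `try/except` at every call site proves noisy, the pattern above could be centralized. The sketch below is one possible shape, not part of the proposed patch: `safe_print` and its `fallback_level` parameter are hypothetical, and the module-level `logger` is assumed to be configured as elsewhere in the project.

```python
import logging

logger = logging.getLogger(__name__)


def safe_print(message, *, fallback_level=None):
    """Print message; return True on success, False if stdout raised OSError.

    When fallback_level is given (e.g. logging.ERROR), the message is sent
    to the logger instead of being dropped.
    """
    try:
        print(message)
        return True
    except OSError:
        if fallback_level is not None:
            logger.log(fallback_level, message)
        return False
```

Cosmetic lines would call `safe_print(text)` and accept the silent drop, while error-handler lines would call `safe_print(f"❌ {error_msg}", fallback_level=logging.ERROR)`. This only helps if the logger has a handler that does not itself write to the broken stdout.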

## Related issues

- #780 — `fix: replace debug print() with logger.error() in file_tools` (same root cause, different file)
- #716 — `fix: log exceptions instead of silently swallowing in cron scheduler` (same theme: silent failure in production daemon paths)

This is more severe than those issues because it actively **crashes the job** rather than just losing a log line.
