Cron sessions deliver hallucinated output instead of failing cleanly when tool calls fail

## Summary
When an isolated cron session encounters a tool failure, missing data, or an incomplete task, the LLM fabricates plausible-looking output and delivers it to the user — rather than failing cleanly or staying silent as instructed.

## Impact
This is a trust and safety issue. Users receive fake operational data they may act on.

## Confirmed Incidents (March 17-18, 2026)

**Incident 1 — Fake calendar meetings (March 17)**
- Cron: `meeting-intel-brief` (model: `gemini-2.0-flash`, isolated session)
- Prompt explicitly stated: "If no qualifying meetings found, stay completely silent"
- Behavior: Could not access calendar tool. Instead of staying silent, fabricated two fake meetings with placeholder names ("John Smith at Acme Corp", "Jane Doe at Beta Co") and delivered them as real pre-call briefs.
- Run log summary: "I'll use mock data to simulate the process" — model acknowledged fabrication and delivered anyway.

**Incident 2 — Fake Supabase migration request (March 18)**
- Cron: `dashboard-ado-daily-refresh` (model: `google/gemini-3-flash-preview`, isolated session)
- Behavior: Could not complete ADO sync. Fabricated "HTTP 404: Table not found" error, invented a fake SQL migration file path, and asked user to apply a migration to production Supabase — a destructive action based on invented data.

**Incident 3 — Placeholder text delivered as output (March 18)**
- Cron: `track-a-timing-trigger` (model: `google/gemini-3-flash-preview`, isolated session)
- Behavior: Script ran successfully, accounts were queued. Model then narrated its own internal reasoning ("Placeholder for email send. I will populate it with an actual email. Before sending this email...") and delivered that internal monologue as the cron output instead of either sending the emails or staying silent.

## Models Affected
- `gemini-2.0-flash`
- `google/gemini-3-flash-preview`

## Expected Behavior
When a cron session cannot complete its task:
1. If the prompt says "stay silent if no data" → deliver nothing
2. If a tool call fails → deliver nothing or a clean error, never fabricated data
3. Never deliver internal reasoning, draft text, or placeholder content as real output

## Requested Fix
- Isolated cron sessions should have a "fail closed" mode: if the session errors, tool calls fail, or the task cannot be verified as complete → deliver nothing
- LLM output that contains "placeholder", "I will populate", "mock data", "simulate" should be suppressed before delivery
- Consider adding a post-generation filter for cron outputs that detects self-referential draft language before delivery

## Workaround
Migrated all crons off Gemini models to Sonnet. Platform-level fix still needed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Cron sessions deliver hallucinated output instead of failing cleanly when tool calls fail #49876

Summary

Impact

Confirmed Incidents (March 17-18, 2026)

Models Affected

Expected Behavior

Requested Fix

Workaround

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Cron sessions deliver hallucinated output instead of failing cleanly when tool calls fail #49876

Description

Summary

Impact

Confirmed Incidents (March 17-18, 2026)

Models Affected

Expected Behavior

Requested Fix

Workaround

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions