What task are you trying to do?
We want PawWork agents to search the web when current or external information matters, without turning web access into a skill or making users manually enable a mode.
What do you do today?
Web search and fetch exist as tools, but the harness direction does not yet define when models should search, how they should treat web content, or how to avoid prompt injection and low-quality sources.
What would a good result look like?
Web search and fetch are default available tools. The harness guidance tells models to search when information may be stale, external, niche, price-related, news-related, legal, financial, medical, product-related, or otherwise likely to change. Retrieved pages are treated as untrusted input. The model prefers official, primary, recent, and source-attributed evidence, and it avoids following instructions found inside fetched pages.
Which audience does this matter to most?
Both
Extra context
Users often complain that agents search too little, not too much. The strategy should encourage appropriate search while keeping source quality and prompt injection risks clear.
Acceptance criteria
- Web search and fetch are part of the default tool surface, not a skill.
- The prompt or tool guidance defines when the model should search.
- Fetched page content is explicitly treated as untrusted input.
- The guidance warns against following instructions from web pages.
- Source priority favors official, primary, recent, and attributable sources.
- User-facing answers include citations or source links when web evidence materially affects the conclusion.
What task are you trying to do?
We want PawWork agents to search the web when current or external information matters, without turning web access into a skill or making users manually enable a mode.
What do you do today?
Web search and fetch exist as tools, but the harness direction does not yet define when models should search, how they should treat web content, or how to avoid prompt injection and low-quality sources.
What would a good result look like?
Web search and fetch are default available tools. The harness guidance tells models to search when information may be stale, external, niche, price-related, news-related, legal, financial, medical, product-related, or otherwise likely to change. Retrieved pages are treated as untrusted input. The model prefers official, primary, recent, and source-attributed evidence, and it avoids following instructions found inside fetched pages.
Which audience does this matter to most?
Both
Extra context
Users often complain that agents search too little, not too much. The strategy should encourage appropriate search while keeping source quality and prompt injection risks clear.
Acceptance criteria