Yzx Wiki

Introduction

This is the entry page for the current wiki/ directory. It indexes the entities, concepts, topics, and source summaries that have been captured in structured form. The current version covers Anthropic material on SWE-bench Verified, Workflows vs agents, BrowseComp, Model Context Protocol (MCP), AI-resistant technical evaluations, Agent teams, Multi-agent research systems, Generator-evaluator loop, Tool ergonomics for agents, Programmatic tool calling, Inference infrastructure regressions, Infrastructure noise in evals, Permission delegation for agents, Sandboxing for agents, Meta-harness, Context engineering, Contextual Retrieval, Context hygiene for agents, Multi-context window workflows, Reasoning tools for agents, Evaluation harness, Spec-driven development, and contribution-driven hiring. It also covers OpenAI's engineering retrospective on Codex and Sora; CitriniResearch's scenario analysis of AI macroeconomic risk, Intelligence displacement spiral, Ghost GDP, Agentic commerce, and Intelligence premium unwind; material from Peter Steinberger, Simon Willison, and Matt Rickard on Agentic engineering workflows and intent constraints; and several pieces on personal growth and hiring.

Key points

Source references

This page is maintained automatically from the structured pages generated by this import.
Import source: [llm-wiki/raw/anthropic/Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet](/raw/anthropic/Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet.md)
Import source: [llm-wiki/raw/anthropic/Building effective agents](/raw/anthropic/Building effective agents.md)
Import source: [llm-wiki/raw/anthropic/How we built our multi-agent research system](/raw/anthropic/How we built our multi-agent research system.md)
Import source: [llm-wiki/raw/anthropic/Eval awareness in Claude Opus 4.6’s BrowseComp performance](/raw/anthropic/Eval awareness in Claude Opus 4.6’s BrowseComp performance.md)
Import source: [llm-wiki/raw/anthropic/Code execution with MCP - Building more efficient agents](/raw/anthropic/Code execution with MCP - Building more efficient agents.md)
Import source: [llm-wiki/raw/anthropic/Designing AI-resistant technical evaluations](/raw/anthropic/Designing AI-resistant technical evaluations.md)
Import source: [llm-wiki/raw/anthropic/Building a C compiler with a team of parallel Claudes](/raw/anthropic/Building a C compiler with a team of parallel Claudes.md)
Import source: [llm-wiki/raw/anthropic/Harness design for long-running application development](/raw/anthropic/Harness design for long-running application development.md)
Import source: [llm-wiki/raw/anthropic/Writing effective tools for agents — with agents](/raw/anthropic/Writing effective tools for agents — with agents.md)
Import source: [llm-wiki/raw/anthropic/Introducing advanced tool use on the Claude Developer Platform](/raw/anthropic/Introducing advanced tool use on the Claude Developer Platform.md)
Import source: [llm-wiki/raw/anthropic/A postmortem of three recent issues](/raw/anthropic/A postmortem of three recent issues.md)
Import source: [llm-wiki/raw/anthropic/Claude Code auto mode- a safer way to skip permissions](/raw/anthropic/Claude Code auto mode- a safer way to skip permissions.md)
Import source: [llm-wiki/raw/anthropic/Beyond permission prompts- making Claude Code more secure and autonomous](/raw/anthropic/Beyond permission prompts- making Claude Code more secure and autonomous.md)
Import source: [llm-wiki/raw/anthropic/Scaling Managed Agents-Decoupling the brain from the hands](/raw/anthropic/Scaling Managed Agents-Decoupling the brain from the hands.md)
Import source: [llm-wiki/raw/anthropic/Best Practices for Claude Code](/raw/anthropic/Best Practices for Claude Code.md)
Import source: [llm-wiki/raw/anthropic/Effective harnesses for long-running agents](/raw/anthropic/Effective harnesses for long-running agents.md)
Import source: [llm-wiki/raw/anthropic/The “think” tool- Enabling Claude to stop and think in complex tool use situations](/raw/anthropic/The “think” tool- Enabling Claude to stop and think in complex tool use situations.md)
Import source: [llm-wiki/raw/anthropic/Demystifying evals for AI agents](/raw/anthropic/Demystifying evals for AI agents.md)
Import source: [llm-wiki/raw/anthropic/Effective context engineering for AI agents](/raw/anthropic/Effective context engineering for AI agents.md)
Import source: [llm-wiki/raw/anthropic/Introducing Contextual Retrieval](/raw/anthropic/Introducing Contextual Retrieval.md)
Import source: [llm-wiki/raw/anthropic/Quantifying infrastructure noise in agentic coding evals](/raw/anthropic/Quantifying infrastructure noise in agentic coding evals.md)
Import source: [llm-wiki/raw/openai/我们如何使用 Codex 在 28 天内构建 Android 版 Sora](/raw/openai/我们如何使用 Codex 在 28 天内构建 Android 版 Sora.md)
Import source: [llm-wiki/raw/01_AI/THE 2028 GLOBAL INTELLIGENCE CRISIS](/raw/01_AI/THE 2028 GLOBAL INTELLIGENCE CRISIS.md)
Import source: [llm-wiki/raw/peter blog/Shipping at Inference-Speed](/raw/peter blog/Shipping at Inference-Speed.md)
Import source: [llm-wiki/raw/peter blog/Just Talk To It - the no-bs Way of Agentic Engineering](/raw/peter blog/Just Talk To It - the no-bs Way of Agentic Engineering.md)
Import source: [llm-wiki/raw/01_AI/Highlights from my conversation about agentic engineering on Lenny’s Podcast](/raw/01_AI/Highlights from my conversation about agentic engineering on Lenny’s Podcast.md)
Import source: [llm-wiki/raw/01_AI/My fireside chat about agentic engineering at the Pragmatic Summit](/raw/01_AI/My fireside chat about agentic engineering at the Pragmatic Summit.md)
Import source: [llm-wiki/raw/01_AI/Your job is to deliver code you have proven to work](/raw/01_AI/Your job is to deliver code you have proven to work.md)
Import source: [llm-wiki/raw/02_AI编程/The Spec Layer](/raw/02_AI编程/The Spec Layer.md)
Import source: llm-wiki/raw/01_AI/国务院关于深入实施“人工智能+”行动的意见
Import source: llm-wiki/raw/03_成长/孙宇晨为什么能这么成功？
Import source: llm-wiki/raw/03_成长/张一鸣2016年演讲
Import source: llm-wiki/raw/03_成长/《不要害怕任何人和任何事》
Import source: llm-wiki/raw/03_成长/品质
Import source: [llm-wiki/raw/03_成长/Hired Through GitHub Part 2](/raw/03_成长/Hired Through GitHub Part 2.md)
