docs: search agent dev note #350
Greptile Summary

This PR adds a new dev note documenting the search agent SFT data pipeline used to generate tool-use trajectories for training Nemotron Super's web browsing capabilities, along with two supporting images and a nav entry. The post is thorough and well-written, covering motivation, the 4-stage pipeline (Wikidata KG walks → riddle generation → trajectory rollouts → SFT post-processing), yield analysis, correctness challenges, and a Data Designer MCP integration walkthrough. Previously flagged issues from earlier review rounds have been largely resolved.

Key changes:

| Filename | Overview |
|---|---|
| docs/devnotes/posts/search-agent.md | New dev note documenting the search agent SFT pipeline; comprehensive and well-structured. Previously flagged issues (missing image, missing script file, OpenAI-spec clarification, yield numbers) have been resolved. One minor typo remains: a double slash in the GTC 2026 URL. |
| mkdocs.yml | Adds "Search Agent" to the Dev Notes nav and reorders RQA Dataset below Design Principles, with an explanatory comment about most-recent-first ordering. Change is minimal and correct. |
| docs/devnotes/posts/images/browsecomp-benchmark-results.jpg | New benchmark results image referenced from search-agent.md; no issues. |
| docs/devnotes/posts/images/wikidata-graph-walk.png | New Wikidata graph walk illustration referenced from search-agent.md; previously flagged as missing, now included. No issues. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["🌐 Wikidata Knowledge Graph\n(100M+ entities)"] -->|SPARQL random walks\n4–8 hops, anti-meta filters| B["Stage 1: Seed Data\n50,000 JSONL records\nseed_entity → final_answer_entity"]
B -->|74% pass filter| C["Stage 2a: Draft Riddle\nLLM — chain clues,\nhide intermediate nodes"]
C -->|65% valid| D["Stage 2b: BrowseComp Obfuscation\nLLM — concise, no breadcrumbs,\n1–2 sentences"]
D -->|24,000 valid questions| E["Stage 3: Trajectory Rollouts\nMiniMax-M2 + Tavily MCP\nThought → Tool Call → Observation loop\n~12 tool calls/sample"]
E -->|29% correct & complete| F["Stage 4: Post-Processing\nNormalize tool outputs\nDrop truncated\nSelect best rollout\nWrite OpenAI-messages JSONL"]
F --> G["~7,000 SFT Records\n14% end-to-end yield\nNemotron Super 0% → 31.28% BrowseComp"]
style A fill:#1a1a2e,color:#eee
style B fill:#16213e,color:#eee
style C fill:#16213e,color:#eee
style D fill:#16213e,color:#eee
style E fill:#0f3460,color:#eee
style F fill:#0f3460,color:#eee
style G fill:#533483,color:#eee
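The yield figures on the flowchart edges can be checked against each other with a few lines of arithmetic. The sketch below is not part of the pipeline; it simply multiplies the three percentages shown on the edges, starting from the 50,000 seed records. The stage labels and the `funnel` helper are illustrative names, and attaching each percentage to a particular stage is an assumption based on where the label sits in the chart.

```python
# Sanity check of the funnel figures quoted in the flowchart.
# All numbers come from the chart; the helper and labels are illustrative.
SEED_RECORDS = 50_000

# (edge label from the chart, fraction of the previous stage that survives)
STAGE_PASS_RATES = [
    ("74% pass filter", 0.74),
    ("65% valid", 0.65),
    ("29% correct & complete", 0.29),
]


def funnel(seed_count, pass_rates):
    """Return a list of (label, surviving_count) after each stage."""
    remaining = float(seed_count)
    rows = []
    for label, rate in pass_rates:
        remaining *= rate
        rows.append((label, remaining))
    return rows


rows = funnel(SEED_RECORDS, STAGE_PASS_RATES)
for label, count in rows:
    print(f"{label}: ~{count:,.0f} records remain")

final_count = rows[-1][1]
end_to_end_yield = final_count / SEED_RECORDS
print(f"End-to-end yield: {end_to_end_yield:.1%}")
```

Under these assumptions the product comes out at roughly 24,050 valid questions and just under 7,000 final records, which agrees with the rounded "24,000 valid questions", "~7,000 SFT Records", and "14% end-to-end yield" labels in the chart.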
Path: docs/devnotes/posts/search-agent.md
Line: 585
Comment:
**Double slash in GTC 2026 URL**
The URL contains a double slash after the domain (`nvidia.com//gtc/`), which is a typo. Most servers will redirect it, but it looks unprofessional in published documentation and may break for some link validators or users on strict proxies.
```suggestion
6. [GTC 2026 Workshop: Building Search Agents with NeMo Data Designer](https://www.nvidia.com/gtc/session-catalog/sessions/gtc26-dlit81572/)
```
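A link check along these lines could catch the same kind of typo before publication. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the "bad" URL is reconstructed from the `nvidia.com//gtc/` fragment quoted in the comment rather than copied from the post itself.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def has_double_slash_in_path(url):
    """Return True if the path component contains consecutive slashes."""
    return "//" in urlsplit(url).path


def collapse_path_slashes(url):
    """Collapse runs of slashes in the path, leaving scheme and host untouched."""
    parts = urlsplit(url)
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, clean_path, parts.query, parts.fragment)
    )


# Reconstructed from the review comment (assumption, see lead-in).
bad_url = "https://www.nvidia.com//gtc/session-catalog/sessions/gtc26-dlit81572/"

print(has_double_slash_in_path(bad_url))
print(collapse_path_slashes(bad_url))
```

Only the path is rewritten, so the `//` that follows the scheme is never touched.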
Last reviewed commit: ecec9a7
- Update date to 2026-03-11 (mvansegbroeck)
- Rewrite intro: keep agentic shift opener, replace middle paragraphs with BrowseComp-first framing per mvansegbroeck suggestion
- Fix trajectory example: "hour-angle coordinate system" is incorrect, changed to "first equatorial coordinate system" (mvansegbroeck)
- Move stale Wikidata ground truth discussion earlier into Step 1 as motivation, with forward link to Correctness Challenge (mvansegbroeck)
- Move safety controls from Key Takeaway #7 into MCP Integration section where it fits better (mvansegbroeck)
- Add GTC 2026 workshop link to Key Resources (mvansegbroeck)
- Replace outdated Try For Yourself code (LocalStdioMCPProvider, Qwen model, no Stage 4) with minimal inline snippet + full recipe include using hosted Tavily MCP endpoint (same pattern as text-to-sql)
PR feedback:
- Update date to 2026-03-12
- Rewrite intro: keep agentic shift opener, replace middle paragraphs with BrowseComp-first framing
- Fix trajectory example: "hour-angle coordinate system" -> "first equatorial coordinate system"
- Move stale Wikidata ground truth into Step 1 as early motivation
- Move safety controls from Key Takeaway #7 into MCP Integration section
- Add GTC 2026 workshop link to Key Resources

New content:
- Add Wikidata graph walk diagram (wikidata-graph-walk.png) to Step 1
- Add BrowseComp Benchmark Results section with bar chart (JPG): Nemotron Super 0% -> 31.28% (SFT + RL), vs GPT-OSS-120B 33.89%
- Replace outdated Try For Yourself code with minimal inline snippet + full recipe include via pymdownx.snippets
greptile is just wrong in this case with its suggestions
Co-authored-by: Johnny Greco <jogreco@nvidia.com>
…IA-NeMo/DataDesigner into dhruv/devnotes/search-agent
johnnygreco left a comment
This is great, @dhruvnathawani!
Summary
Add a dev note documenting the search agent SFT data pipeline used to generate tool-use trajectories for training Nemotron Super's web browsing capabilities.
What's in the post
Files changed