[docs] refactor: Organize model docs by family #3908
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test 0964c9c
Review: Clean docs-only refactor. All old docs/models/llm/ and docs/models/vlm/ references have been swept; no stale references remain in .md, .py, or .yaml files. Cross-links between model pages, examples, and skills are updated correctly. The Sphinx toctree in docs/index.md now lists brand index pages directly, which is consistent.

Minor issue (README.md line 16): The Nemotron-3 Nano Omni news entry still says the model is available on the nemotron_3_omni branch, but the examples README link now points to main. Since the examples directory exists on main, the link is correct, but the surrounding prose is inconsistent. Consider updating the text to say the model is now available on main. This may be a pre-existing issue that this PR surfaced by fixing the link.

Suggested test cases: No perf tests impacted.
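A check like the "no stale references" sweep in the review can be sketched in a few lines of Python. This is only an illustration, not the tooling the reviewer used: the function name `find_stale_references` is hypothetical, and the only details taken from the review are the two old path prefixes and the three file types.

```python
from pathlib import Path

# Old doc prefixes named in the review; the suffix set mirrors the file types it lists.
STALE_PREFIXES = ("docs/models/llm/", "docs/models/vlm/")
SCANNED_SUFFIXES = {".md", ".py", ".yaml"}


def find_stale_references(root):
    """Return (path, line_number, line) for every line that mentions an old prefix.

    Hypothetical helper for illustration; not part of the repository.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SCANNED_SUFFIXES:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip files that cannot be read as UTF-8 text
        for number, line in enumerate(text.splitlines(), start=1):
            if any(prefix in line for prefix in STALE_PREFIXES):
                hits.append((str(path), number, line.strip()))
    return hits
```

An empty result from `find_stale_references(".")` at the repository root would match the review's claim. If the script itself is stored inside the repository, it will report its own `STALE_PREFIXES` line, so it is best run from outside the tree.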
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test e0ea428
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test 990003a
Mistral and Ministral are arguably the same family.
Folded Ministral into the Mistral docs family: moved the Ministral 3 guide under docs/models/mistral/, removed the separate Ministral family index/toctree entry, and updated docs/example links so Ministral 3 remains discoverable under Mistral.
Follow-up addressed: moved the Ministral 3 examples from examples/models/ministral/ministral3 to examples/models/mistral/ministral3, removed the empty examples/models/ministral directory, and updated the README/docs/skill references to the new Mistral-family examples path.
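The reference update described in this follow-up amounts to a prefix rewrite across text files. A minimal sketch, assuming only Markdown files need to change; the helper names `rewrite_example_links` and `rewrite_tree` are hypothetical, and only the old and new paths come from the comment above.

```python
from pathlib import Path

# Old and new example locations, as stated in the follow-up comment.
OLD_PREFIX = "examples/models/ministral/ministral3"
NEW_PREFIX = "examples/models/mistral/ministral3"


def rewrite_example_links(text):
    """Replace every occurrence of the old examples path with the new one."""
    return text.replace(OLD_PREFIX, NEW_PREFIX)


def rewrite_tree(root, suffixes=(".md",)):
    """Rewrite matching files under root in place and return the paths that changed.

    Hypothetical helper; the default suffix list is an assumption.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        original = path.read_text(encoding="utf-8")
        updated = rewrite_example_links(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(str(path))
    return changed
```

Because the new prefix does not contain the old one, running the rewrite a second time changes nothing.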
| | **Nemotron** | [Nemotron H](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/nemotronh), [Nemotron Nano v2](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/nemotronh), [Nemotron-3 Nano](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/nemotronh), [Nemotron-3 Super](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/nemotronh), [Llama Nemotron](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/llama_nemotron), [Nemotron Nano v2 VL](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/nemotron_vl), [Nemotron-3 Nano Omni](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/nemotron_omni) | [Nemotron H recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/nemotronh/nemotronh.py), [Nemotron Nano v2 recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/nemotronh/nemotron_nano_v2.py), [Nemotron-3 Nano recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/nemotronh/nemotron_3_nano.py), [Nemotron-3 Super recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/nemotronh/nemotron_3_super.py), [Nemotron Nano v2 VL recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/nemotron_vl/nemotron_nano_v2_vl.py), [Nemotron-3 Nano Omni recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/nemotron_omni/nemotron_omni.py) | | ||
| | **OLMoE** | [OLMoE](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/olmoe) | [recipes (7B)](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/olmoe/olmoe_7b.py) | | ||
| | **Qwen** | [Qwen2 / Qwen2.5](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen), [Qwen3](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen), [Qwen3-MoE](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen), [Qwen3 Next](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen), [Qwen2.5-VL](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_vl), [Qwen3-VL](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_vl), [Qwen3.5-VL](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_vl), [Qwen3.6-VL](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_vl), [Qwen2 Audio](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_audio), [Qwen2.5-Omni](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_omni), [Qwen3-Omni](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen_omni), [Qwen3-ASR](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen3_asr) | [Qwen2 recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/qwen/qwen2.py), [Qwen3 recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/qwen/qwen3.py), [Qwen3-MoE recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/qwen/qwen3_moe.py), [Qwen3 Next recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/qwen/qwen3_next.py), [Qwen VL recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/recipes/qwen_vl), [Qwen Omni recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/recipes/qwen_omni), [Qwen examples](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/models/qwen) | | ||
| | **Sarvam** | [Sarvam](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/sarvam) | [examples](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/models/sarvam) | |
Should we link to the README, if it exists? Otherwise this table is a bit busy.
Simplified the README supported-models table: variants are now plain text, and the resources column links to family docs plus README entry points where available instead of long per-recipe link lists. Also updated the Nemotron-3 Nano Omni news item to say the recipes are available on main.
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test dd91f2b
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test 4afcac2
| | **Mamba** | Mamba | [model bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/mamba) | | ||
| | **MiniMax** | MiniMax-M2 / M2.5 / M2.7 | [examples README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/minimax/minimax_m2/README.md) | | ||
| | **Mistral** | Mistral, Ministral 3 (3B/8B/14B) | [model docs](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/mistral/index.md), [Ministral 3 examples README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/mistral/ministral3/README.md) | | ||
| | **MiMo** | MiMo | [Megatron-MiMo training examples](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/megatron_mimo) | |
Not Megatron-MiMo; this should be Xiaomi-MiMo.
Updated the README branding to Xiaomi-MiMo; the supported-models row now links to the new Xiaomi-MiMo docs page.
| | **GPT-OSS** | GPT-oss | [model docs](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/gpt_oss/index.md), [examples README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/gpt_oss/README.md) | | ||
| | **Kimi** | Kimi K2, Kimi-K2.5-VL | [Kimi-K2.5-VL examples README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/kimi/kimi_k25_vl/README.md) | | ||
| | **Llama** | Llama 2, Llama 3 / 3.1 / 3.2 / 3.3 | [model docs](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/llama/index.md), [recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/recipes/llama) | | ||
| | **Mamba** | Mamba | [model bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/mamba) | |
Removed Mamba from the README supported-models table.
| | **Nemotron** | Nemotron H, Nemotron Nano v2, Nemotron-3 Nano, Nemotron-3 Super, Llama Nemotron, Nemotron Nano v2 VL, Nemotron-3 Nano Omni | [model docs](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/nemotron/index.md), [Nemotron-3 README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/nemotron/nemotron_3/README.md), [Nemotron-3 Omni README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/nemotron/nemotron_3_omni/README.md) | | ||
| | **OLMoE** | OLMoE | [model docs](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/olmoe/index.md), [recipe](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/olmoe/olmoe_7b.py) | | ||
| | **Qwen** | Qwen2 / Qwen2.5, Qwen3, Qwen3-MoE, Qwen3 Next, Qwen2.5-VL, Qwen3-VL, Qwen3.5-VL, Qwen3.6-VL, Qwen2 Audio, Qwen2.5-Omni, Qwen3-Omni, Qwen3-ASR | [model docs](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/qwen/index.md), [examples directory](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/models/qwen) | | ||
| | **Sarvam** | Sarvam | [examples README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/examples/models/sarvam/README.md) | |
We don't need the last column. For hyperlinks, just link the model name to its model doc page, e.g. link https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models/qwen/index.md to Qwen.
If a doc page is missing for a model, please add one in the docs folder, and link the examples directory from that model doc.
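The requested row format, with the family name itself linking to its docs index, could be generated along these lines. This is a sketch under assumptions: `family_row` is a hypothetical helper, the two-column layout is one reading of "we don't need the last column", and the URL pattern is inferred from the `docs/models/<family>/index.md` links that appear in the rows above.

```python
# Base URL taken from the model-docs links in the table rows above.
DOCS_BASE = "https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/docs/models"


def family_row(family, slug, variants):
    """Build one Markdown table row whose first cell links the family name to its docs page.

    Hypothetical helper; `slug` is the directory name under docs/models/.
    """
    link = f"[**{family}**]({DOCS_BASE}/{slug}/index.md)"
    return f"| {link} | {', '.join(variants)} |"


print(family_row("Qwen", "qwen", ["Qwen3", "Qwen3-MoE"]))
```

With this shape, links to examples live in the family doc page rather than in the README table, which is what the comment asks for.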
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test b9619df
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
Summary
Validation
Unit tests were not run per request; this is a docs/navigation refactor.