Add Core Concepts Tutorial by nvrohanv · Pull Request #217 · ai-dynamo/grove

nvrohanv · 2025-10-15T21:35:13Z

Adding a tutorial introducing the core Grove primitives. Examples can be run on a local kind cluster.
Allowing `make kind-up` to create an arbitrary number of fake nodes.

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

gflarity

Looks good overall, just a few suggestions around organization mostly. Please take a look and let me know if you have any questions.

gflarity · 2025-10-17T13:57:22Z

Oh, one more thing. I think a quickstart would also be useful (one that doesn't involve the fakes). It's the first thing I look for in a POC.

## Motivation

During hands-on testing of the Grove installation process, several critical usability issues were discovered that would block new users from successfully deploying Grove. Additionally, the README was too verbose and didn't quickly communicate the core value proposition to developers evaluating the project.

## Changes Made

### installation.md - Fixed Critical Blockers

**Working Directory Confusion**
- Added explicit "Navigate to operator directory" instructions
- Impact: Users can now follow the guide linearly without trial-and-error

**KUBECONFIG Setup Broken**
- kind-up script has a bug and doesn't export KUBECONFIG properly
- Added manual workaround using `kind get kubeconfig`
- Impact: Users can now successfully deploy after creating kind cluster

**Wrong Resource Names**
- Fixed: simple1-0-pcsg → simple1-0-sga (actual resource name)
- Impact: Scaling examples now work as documented

**Added Troubleshooting Section**
- Covers deployment issues, runtime issues, and community resources
- Impact: Users can self-serve when encountering common issues

### README.md - Refocused on Problem → Solution → Action

**Shortened from ~80 lines to ~40 lines of core content**

New structure:
1. Problem First: What's broken in K8s for AI inference
2. Solution: Grove's one-liner positioning
3. Quick Start: 4 commands to deploy in 5 minutes
4. What Grove Solves: Table mapping scenarios to capabilities
5. How It Works: Simplified concept explanations

Roadmap simplified to Q4 2025 / Q1 2026 (removed specific outdated dates)

Impact: Users understand value prop in 30 seconds and can start immediately

### quickstart.md - New 10-Minute Tutorial

- Explains the 4-component example architecture
- Step-by-step deployment with expected outputs
- Demonstrates both PCSG and PCS scaling
- Includes hierarchy visualization
- Kind-specific troubleshooting tips

Impact: New users get immediate success experience in 10 minutes

## Testing Performed

All changes validated through fresh kind cluster deployment on macOS, following installation.md step-by-step, and verifying all examples work.

Co-authored-by: Claude <noreply@anthropic.com>

…badge

- Replace verbose technical description with problem-first approach
- Add "One API. Any inference architecture." tagline for clarity
- Include Quick Start section for immediate value demonstration
- Add "What Grove Solves" table mapping use cases to capabilities
- Simplify "How It Works" section with concise concept table
- Add DeepWiki badge for community Q&A support
- Update roadmap to use Q4 2025/Q1 2026 format

Co-Authored-By: Claude <noreply@anthropic.com>

renormalize

1/n as I've not gotten a chance to look through the entire PR yet.

Co-authored-by: Geoff Flarity <geoff.flarity@gmail.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

gflarity

Just moving this to approve to avoid friction. We discussed some of the comments in a meeting.

Co-authored-by: Geoff Flarity <geoff.flarity@gmail.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

**README.md:**
- Remove `kind get kubeconfig` command (already handled by Makefile)
- Add `--watch` flag to demonstrate actual watching behavior

**User guide improvements:**
- Add inline comments to all podSpec examples clarifying they are standard Kubernetes PodSpecs
- Change PodClique comparison from "Deployment" to "ReplicaSet" with gang termination behavior
- Clarify blue-green deployment mentions with more specific use cases (canary deployments, A/B testing, high availability)
- Add "When to scale what" section explaining when to scale PodCliqueScalingGroup vs individual PodCliques

Addresses feedback from:
- gflarity: podSpec comments, scaling clarification
- renormalize: kubeconfig removal, watch command, blue-green justification, PodClique comparison
- athreesh: (previously addressed in earlier commits)

Co-Authored-By: Claude <noreply@anthropic.com>

Add "Understanding Scaling Levels" section to overview.md that clearly explains when to scale PCS vs PCSG vs PodClique replicas.

This addresses gflarity's feedback requesting clarification on when to increase PCS replicas vs PCSG replicas.

The new section provides clear guidance:
- Scale PCS for system-level operations (canary, A/B, availability zones)
- Scale PCSG to add more multi-node component instances
- Scale PodClique to fine-tune individual component pods

Co-Authored-By: Claude <noreply@anthropic.com>
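The scaling guidance in this commit message can be read as a small decision rule. The following is a minimal sketch of that rule, not part of Grove: the goal labels (`canary`, `ab_test`, and so on) and the function name `choose_scaling_level` are hypothetical names invented for illustration, and the code does not call Grove or Kubernetes. It only restates the mapping described above.

```python
from enum import Enum


class ScalingLevel(Enum):
    """The three levels named in the commit message."""
    PCS = "PCS"
    PCSG = "PCSG"
    PODCLIQUE = "PodClique"


# Hypothetical goal labels, invented for this sketch only.
_GOAL_TO_LEVEL = {
    # System-level operations: scale PCS replicas.
    "canary": ScalingLevel.PCS,
    "ab_test": ScalingLevel.PCS,
    "availability_zone": ScalingLevel.PCS,
    # More instances of a multi-node component: scale PCSG replicas.
    "more_multi_node_instances": ScalingLevel.PCSG,
    # Fine-tuning the pods of one component: scale PodClique replicas.
    "tune_component_pods": ScalingLevel.PODCLIQUE,
}


def choose_scaling_level(goal: str) -> ScalingLevel:
    """Return the level whose replica count the guidance suggests changing."""
    try:
        return _GOAL_TO_LEVEL[goal]
    except KeyError:
        raise ValueError(f"unknown goal: {goal!r}") from None
```

For example, `choose_scaling_level("canary")` returns `ScalingLevel.PCS`, while an unrecognized label raises `ValueError` rather than guessing a level.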

Remove `export KUBECONFIG` line from Quick Start section as the Makefile already handles KUBECONFIG configuration automatically for make targets (see operator/Makefile line 30).

Addresses renormalize's feedback that this line is not needed.

Co-Authored-By: Claude <noreply@anthropic.com>

- Add DeepWiki badge at top of README
- Keep improved Quick Start without redundant KUBECONFIG export (per renormalize feedback)
- Keep improved installation.md KUBECONFIG instructions
- Remove reference to non-existent quickstart.md

Resolves conflicts between improved documentation and main branch.

Co-Authored-By: Claude <noreply@anthropic.com>

Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Removed construction note and adjusted badge placement.

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Rohan Varma <rohanv@nvidia.com>

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

renormalize · 2025-11-03T19:38:29Z

@nvrohanv reminder to clean up the git commit message while merging this PR, as the PR has been merged with main multiple times, and there are a significant number of commits. It would be nice to keep the commit message short.

Co-authored-by: Sanjay Chatterjee <sanjay.chatterjee@gmail.com> Signed-off-by: Rohan Varma <rohanv@nvidia.com>

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

nvrohanv added 2 commits October 15, 2025 14:20

add concept overview doc and demo

2c3bb4f

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

split up core-concepts guide into more readable unit

0a7dd98

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

nvrohanv requested review from sanjaychatterjee and unmarshall as code owners October 15, 2025 21:35

nvrohanv requested a review from athreesh October 15, 2025 21:35

athreesh reviewed Oct 16, 2025

Comment thread docs/installation.md

gflarity requested changes Oct 17, 2025

athreesh and others added 2 commits October 19, 2025 15:05

renormalize reviewed Oct 23, 2025

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread docs/installation.md

renormalize suggested changes Oct 23, 2025

athreesh and others added 5 commits October 24, 2025 11:55

Update docs/user_guide/overview.md

6f35e6f

Co-authored-by: Geoff Flarity <geoff.flarity@gmail.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update docs/user_guide/overview.md

8e6297c

Co-authored-by: Geoff Flarity <geoff.flarity@gmail.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update docs/user_guide/overview.md

f26e99b

Co-authored-by: Geoff Flarity <geoff.flarity@gmail.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update README.md

e5ffd48

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update README.md

f849cbe

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

gflarity previously approved these changes Oct 27, 2025

Update docs/user_guide/pcs_and_pclq_intro.md

f2028f6

Co-authored-by: Geoff Flarity <geoff.flarity@gmail.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

athreesh dismissed gflarity’s stale review via f2028f6 October 28, 2025 16:42

athreesh and others added 10 commits October 28, 2025 09:43

Update README.md

976b39d

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update docs/user_guide/pcsg_intro.md

2470b58

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update docs/user_guide/pcsg_intro.md

9c092a2

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Update docs/user_guide/takeaways.md

df64f87

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

Merge branch 'main' into nvrohanv/add_overview_tutorial

3b6133f

Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>

removed construction worker + moved deepwiki badge to the right place

cffcce3

Removed construction note and adjusted badge placement.

nvrohanv added 3 commits October 31, 2025 01:50

resolve comments and rename directories to fit k8 conventions

3e7545c

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

clarify pcs replica vs pclq in agg example

0f120ca

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

update example yaml paths

bc06377

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

renormalize reviewed Nov 3, 2025

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread docs/installation.md

Comment thread operator/hack/kind-up.sh

nvrohanv and others added 2 commits November 3, 2025 08:49

Update README.md

1cf4fbd

Co-authored-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com> Signed-off-by: Rohan Varma <rohanv@nvidia.com>

remove deepwiki for now

dc92b54

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

renormalize previously approved these changes Nov 3, 2025

Merge branch 'main' into nvrohanv/add_overview_tutorial

882da34

sanjaychatterjee reviewed Nov 5, 2025

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread README.md Outdated

Apply suggestions from code review

46cfb8c

Co-authored-by: Sanjay Chatterjee <sanjay.chatterjee@gmail.com> Signed-off-by: Rohan Varma <rohanv@nvidia.com>

nvrohanv dismissed renormalize’s stale review via 46cfb8c November 6, 2025 00:11

nvrohanv added 2 commits November 5, 2025 17:52

update readme

c869218

Signed-off-by: Rohan Varma <rohanv@nvidia.com>

Merge branch 'main' into nvrohanv/add_overview_tutorial

98cd42a

sanjaychatterjee approved these changes Nov 6, 2025

nvrohanv merged commit 6b9ae3f into ai-dynamo:main Nov 6, 2025
3 checks passed

5 participants