Problem Statement
Skills Context Overflow Problem
Current Situation:
- I've authored 100+ specialized skills organized behind ~10 router skills
- Router skills are designed to dynamically load the appropriate specialized skill on-demand
- Each skill's name + description consumes ~30-50 tokens in Claude's global context
The Problem:
Without a way to hide or prioritize skills, all 100+ skills load their metadata into context automatically, causing:
- Router crowding: The 10 router skills that Claude should prioritize are buried among 100+ specialized skills, making skill discovery unreliable
- Wasted context budget: ~3,000-5,000 tokens consumed by metadata for skills that should only load on-demand when routed to
- Scalability ceiling: Can only load 1-2 skill packs before hitting context limits, defeating the purpose of the router pattern
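The overhead figures above follow from simple multiplication. A minimal sketch, assuming the ~30-50 tokens-per-skill range quoted in this issue (the function name `metadata_budget` is hypothetical, not part of any Claude Code API):

```python
# Rough estimate of skill-metadata overhead.
# The per-skill range comes from the issue text; it is an estimate, not a measurement.
TOKENS_PER_SKILL = (30, 50)  # (low, high) tokens for one name + description


def metadata_budget(skill_count, per_skill=TOKENS_PER_SKILL):
    """Return (low, high) token estimates for loading metadata of skill_count skills."""
    low, high = per_skill
    return skill_count * low, skill_count * high


all_skills = metadata_budget(100)   # every skill's metadata in context
routers_only = metadata_budget(10)  # only the router skills in context
print(all_skills, routers_only)
```

With 100 skills this gives the 3,000-5,000 token range cited above; with only 10 routers visible it drops to 300-500.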
Proposed Solution
What's Needed:
Two complementary mechanisms:
- Hidden Skills (context efficiency)
Mark specialized skills as "hidden" so they don't consume context until explicitly invoked:
name: python-engineering:async-debugging
description: Deep async/await debugging patterns
hidden: true # Don't load metadata into global context
- Router Priority (skill discovery)
Mark router skills with higher priority so Claude checks them first:
name: python-engineering
description: Routes to specialized Python skills based on problem type
router: true # Prioritize during skill selection
Combined Effect:
- Claude sees only 10 router skills in context (~300-500 tokens)
- Routers are checked first during skill discovery
- Specialized skills load on-demand when router invokes them
- Pattern scales to hundreds of skills without context bloat
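To illustrate the intended semantics, here is a sketch of how a loader might combine the two proposed fields. Everything in it is an assumption for illustration: the `Skill` class, the `visible_metadata` function, and the `changelog-writer` skill are hypothetical, and this is not how Claude Code currently loads skills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    hidden: bool = False  # proposed field: keep metadata out of global context
    router: bool = False  # proposed field: list first during skill discovery


def visible_metadata(skills):
    """Drop hidden skills, then list routers ahead of ordinary skills."""
    shown = [s for s in skills if not s.hidden]
    # sorted() is stable, so skills keep their original order within each group.
    return sorted(shown, key=lambda s: not s.router)


skills = [
    Skill("python-engineering:async-debugging",
          "Deep async/await debugging patterns", hidden=True),
    Skill("changelog-writer", "Hypothetical ordinary skill"),
    Skill("python-engineering",
          "Routes to specialized Python skills based on problem type", router=True),
]

for skill in visible_metadata(skills):
    print(skill.name)
```

Here the hidden skill is never listed, and the router appears ahead of the ordinary skill even though it was declared last.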
Example Hierarchy:
✓ python-engineering (router: true, visible, ~40 tokens)
├── python-engineering:async-debugging (hidden: true, 0 tokens until invoked)
├── python-engineering:testing-patterns (hidden: true, 0 tokens until invoked)
└── python-engineering:performance-profiling (hidden: true, 0 tokens until invoked)
✓ pytorch-engineering (router: true, visible, ~40 tokens)
├── pytorch-engineering:tensor-debugging (hidden: true, 0 tokens until invoked)
└── pytorch-engineering:distributed-training (hidden: true, 0 tokens until invoked)
Benefits:
- Supports large-scale skill libraries (100+ skills)
- Efficient context usage (only routers in global context)
- Reliable skill discovery (routers prioritized)
- Enables proper separation of concerns (routing vs. specialized knowledge)
Alternative Solutions
Right now the manual workaround is adding and disabling plugins as required; however, this requires reloading Claude Code and is severely disruptive.
Priority
High - Significant impact on productivity
Feature Category
Configuration and settings
Use Case Example
Scenario: A developer working on a PyTorch training loop encounters a CUDA out-of-memory error.
Without hidden/router fields (Current Behavior):
Developer: "My PyTorch training is crashing with CUDA OOM errors"
Claude's context at session start:
Available skills (112 total, ~4,000 tokens):
- python-engineering
- python-engineering:async-debugging
- python-engineering:testing-patterns
- python-engineering:performance-profiling
- pytorch-engineering
- pytorch-engineering:tensor-debugging
- pytorch-engineering:distributed-training
- pytorch-engineering:memory-profiling
- pytorch-engineering:mixed-precision
... (104 more skills)
Problem: Claude has to scan through 112 skill descriptions to find the right match. The pytorch-engineering router is buried. Claude might randomly pick pytorch-engineering:memory-profiling directly, bypassing the router's triage logic, or miss it entirely and not use any skill.
With hidden/router fields (Proposed Behavior):
Developer: "My PyTorch training is crashing with CUDA OOM errors"
Claude's context at session start:
Available skills (10 routers, ~400 tokens):
- python-engineering (router)
- pytorch-engineering (router)
- react-engineering (router)
- testing-workflows (router)
- database-engineering (router)
... (5 more routers)
Step-by-step flow:
- Skill Discovery: Claude scans 10 router descriptions and immediately identifies the pytorch-engineering router from the matching keywords: "PyTorch", "training", "CUDA"
- Router Invocation: Claude invokes pytorch-engineering router
- Router Logic: Router reads the error details and routes to the appropriate specialized skill:
Symptoms: CUDA OOM, training loop
→ Loading pytorch-engineering:memory-profiling
- On-Demand Loading: The hidden skill pytorch-engineering:memory-profiling loads its full content (0 tokens → ~2,000 tokens)
- Problem Solving: Specialized skill provides deep diagnostic steps:
- Check batch size vs. available VRAM
- Inspect model parameter count
- Check for memory leaks (detached tensors, cached gradients)
- Suggest gradient accumulation or mixed-precision training
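The routing step in the flow above could be as simple as a keyword table. This is only a sketch: in practice a router skill is a prompt that Claude interprets, and the `ROUTES` table, its keywords, and the `route` function are hypothetical stand-ins for that triage logic.

```python
from typing import Optional

# Hypothetical triage table for the pytorch-engineering router.
# Skill names come from the issue; the keywords are illustrative assumptions.
ROUTES = {
    "pytorch-engineering:memory-profiling": ("oom", "out of memory", "won't fit"),
    "pytorch-engineering:distributed-training": ("ddp", "multi-gpu", "nccl"),
    "pytorch-engineering:tensor-debugging": ("shape mismatch", "dtype"),
}


def route(message: str) -> Optional[str]:
    """Return the first specialized skill whose keywords appear in the message."""
    text = message.lower()
    for skill, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return skill
    return None  # no match: the router would fall back to asking for details


print(route("My PyTorch training is crashing with CUDA OOM errors"))
```

Only the skill returned by the router would then be loaded in full; the other specialized skills stay out of context.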
Benefits in this scenario:
- ✅ Fast skill discovery (10 vs. 112 skills to scan)
- ✅ Correct routing logic applied (router triages the problem)
- ✅ Efficient context usage (only loads relevant specialized skill)
- ✅ Deep expertise when needed (full skill content available after routing)
Another Example: Multi-Domain Projects
Developer working on a full-stack ML application:
Session 1: "I need to optimize my FastAPI endpoints"
→ Sees python-engineering router
→ Routes to python-engineering:async-patterns
→ Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens
Session 2: "My React frontend is re-rendering too much"
→ Sees react-engineering router
→ Routes to react-engineering:performance-optimization
→ Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens
Session 3: "My PyTorch model won't fit on the GPU"
→ Sees pytorch-engineering router
→ Routes to pytorch-engineering:memory-profiling
→ Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens
Without hidden/router fields:
- Each session starts with ~4,000 tokens of skill metadata
- Random skill selection bypasses router logic
- Can't install all three skill packs (context overflow)
With hidden/router fields:
- Each session starts with ~400 tokens of router metadata
- Reliable routing to specialized skills
- All three skill packs installed and working efficiently
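The per-session totals can be checked with a few lines of arithmetic. The constants are the approximate figures quoted in this issue; the `session_cost` helper is hypothetical, and the "without fields" total assumes one specialized skill is also loaded on top of the metadata.

```python
# Approximate figures from this issue (estimates, not measurements).
ROUTER_METADATA = 400   # metadata for 10 routers
ALL_METADATA = 4_000    # metadata for all 112 skills
SKILL_BODY = 2_000      # full content of one specialized skill


def session_cost(metadata_tokens, loaded_skills=1):
    """Tokens spent on skill metadata plus the bodies of the skills actually loaded."""
    return metadata_tokens + loaded_skills * SKILL_BODY


with_fields = session_cost(ROUTER_METADATA)
without_fields = session_cost(ALL_METADATA)
print(with_fields, without_fields)
```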