Skip to content

feat(skills): add autonomous research and tinker GPU training capabilities#760

Closed
ar0cket1 wants to merge 1 commit into
NousResearch:mainfrom
ar0cket1:feat/autonomous-research-progress
Closed

feat(skills): add autonomous research and tinker GPU training capabilities#760
ar0cket1 wants to merge 1 commit into
NousResearch:mainfrom
ar0cket1:feat/autonomous-research-progress

Conversation

@ar0cket1

@ar0cket1 ar0cket1 commented Mar 9, 2026

Copy link
Copy Markdown
Contributor

Summary

Add two new skills to enable fully autonomous ML research workflows:

1. Tinker Skill (mlops/tinker)

  • API-based GPU training for CPU-only agents
  • LoRA fine-tuning, RL, DPO, RLHF support
  • Self-chaining session pattern for long training runs
  • Checkpoint-based resumption and monitoring

2. Autonomous Researcher Skill (research/autonomous-researcher)

  • End-to-end research pipeline orchestration
  • 8-phase workflow: literature review → ideation → implementation → training → evaluation → iteration → writing → publishing
  • Multiple input modes (full autonomous, user hypothesis, user data, etc.)
  • Routes to specialized skills for each phase

Motivation

These skills close the "compute gap" for Hermes Agent, enabling autonomous training and research without local GPU access. The Tinker API allows the agent to control training loops while offloading GPU compute to remote infrastructure.

The autonomous researcher skill provides a complete research automation framework, combining:

  • Literature review (arxiv)
  • Implementation (codex/claude-code)
  • Training (tinker)
  • Evaluation (lm-eval-harness)
  • Publication (HF Hub, GitHub)

Test Plan

  • Verify skill YAML frontmatter is valid
  • Check skill descriptions appear correctly in skills_list()
  • Test skill loading with skill_view()
  • Verify cross-references to required skills are correct

Files Changed

  • skills/mlops/tinker/SKILL.md - New Tinker training skill
  • skills/research/autonomous-researcher/SKILL.md - New autonomous research skill

…ities

Add two new skills to enable fully autonomous ML research workflows:

1. Tinker skill (mlops/tinker):
   - API-based GPU training for CPU-only agents
   - LoRA fine-tuning, RL, DPO, RLHF support
   - Self-chaining session pattern for long training runs
   - Checkpoint-based resumption and monitoring

2. Autonomous Researcher skill (research/autonomous-researcher):
   - End-to-end research pipeline orchestration
   - 8-phase workflow: literature review -> ideation -> implementation
     -> training -> evaluation -> iteration -> writing -> publishing
   - Multiple input modes (full autonomous, user hypothesis, user data, etc.)
   - Routes to specialized skills for each phase

Together these skills close the "compute gap" for Hermes Agent, enabling
autonomous training and research without local GPU access. The Tinker API
allows the agent to control training loops while offloading GPU compute
to remote infrastructure.

The autonomous researcher skill provides a complete research automation
framework, combining literature review (arxiv), implementation (codex/claude-code),
training (tinker), evaluation (lm-eval-harness), and publication (HF Hub, GitHub).
@teknium1 teknium1 marked this pull request as draft March 11, 2026 07:34
@teknium1

Copy link
Copy Markdown
Contributor

Thanks for contributing these! Putting into draft for now — we'll review properly after the hackathon wraps up.

A few things we'll want to address before merging:

  1. Quality pass needed — Both skills are heavy on pseudocode and aspirational workflow descriptions. We'd want to tighten them up with tested, actionable instructions that actually work end-to-end.
  2. Terminal artifacts — Both files have a leaked fence string at the end that needs to be removed.
  3. Dependency references — A few skill names in the autonomous-researcher metadata are wrong (hermes-agent-spawninghermes-agent, evaluating-llms-harnesslm-evaluation-harness).
  4. Bundled vs Hub — These are fairly specialized (Tinker API key required, ML research niche). May be better as Skills Hub entries rather than bundled with every install.

Will circle back after the hackathon!

@nidhishgajjar

Copy link
Copy Markdown

Orb Code Review (GLM 5.1 on Orb Cloud)

PR #760 Review: feat(skills): add autonomous research and tinker GPU training capabilities

Reviewed both new SKILL.md files: skills/mlops/tinker/SKILL.md and skills/research/autonomous-researcher/SKILL.md.


🐛 Bugs

1. Wrong module reference in Supervised Learning example (tinker SKILL.md)

trainer.optim_step(types.AdamParams(learning_rate=1e-4)).result()

types is never imported — should be tinker.AdamParams to match every other example in the file.

2. Undefined data variable in code examples
Multiple code blocks reference data or batch without defining them (Quick Start, Supervised Learning, train_chunk.py). Users copying these examples will hit NameError. Consider adding a placeholder like data = load_batch() or noting it with a comment.

3. Missing Mode D example invocation
The Input Modes section defines Modes A–E, but the Example Invocations section jumps from Mode C to Mode E. Mode D ("User-Provided Everything") has no corresponding example.


🔒 Security

4. Shell injection in self-chaining pattern

f"hermes chat -q 'Continue training with checkpoint {decision["checkpoint_path"]}'"

The checkpoint_path from the API response is interpolated directly into a shell command without sanitization. If the path is manipulated or contains shell metacharacters, this is a command injection vector. Consider using shlex.quote() or passing the value as a separate argument.

5. git add . && git push without review
The publishing workflow includes git add . which could accidentally commit secrets, large binaries, or other unintended files. Consider being more explicit about what gets committed.


📋 Content / Quality

6. Trailing HERMES_EOF artifacts in both files
Both SKILL.md files end with garbage text:

HERMES_EOF; __hermes_rc=$?; printf '__HERMES_FENCE_a9f7b3__'; exit $__hermes_rc

This appears to be a session fence marker that was accidentally included in the file content. It should be removed, and both files should end with a proper newline.

7. Massive duplication between the two files
Phases 4–8 (Training, Evaluation, Iteration, Writing, Publishing), the Skill Routing Table, Example Invocations, Quick Reference Commands, and Troubleshooting sections are nearly identical in both files. This creates a significant maintenance burden — any fix or update must be applied in two places. Consider having the autonomous-researcher skill reference the tinker skill for training-specific details rather than duplicating them.

8. tinker checkpoint download is undocumented
The evaluation workflow references tinker checkpoint download tinker://model_id/final as a CLI command, but this command doesn't appear in the Tinker skill's Core API or Setup sections. Is this a real CLI command or should it be a Python API call?


Summary

The overall structure and concept of both skills is solid — the 8-phase research pipeline and self-chaining session pattern are well-designed. However, the code examples contain bugs that would confuse users, there's a security concern with shell interpolation, and the trailing HERMES_EOF artifacts need to be cleaned up before merging. The duplication between files is the largest maintenance concern.

Recommendation: Request changes — address the HERMES_EOF artifacts, the types.AdamParams bug, undefined variables in examples, and the shell injection risk.

@ether-btc

Copy link
Copy Markdown
Contributor

Charon Code Review — PR #760

Two new skills: Tinker (GPU fine-tuning via Thinking Machines Lab API) and Autonomous Researcher (end-to-end ML research pipeline). Both are SKILL.md files — primarily documentation/instructional content. Structural review below.


🔴 Critical — ml-paper-writing skill listed as required but does not exist

autonomous-researcher/SKILL.md metadata:

requires:
  - ml-paper-writing   # ← listed here

ml-paper-writing is not in the available skills registry. Phase 7 (Writing) routes to this non-existent skill. The skill invocation will fail at runtime. Either remove it from requires and the routing table, or create the actual skill first.


🔴 Critical — Tinker skill has a Python syntax error in RL training example

tinker/SKILL.md:

for iteration in range(15):
    sampler = trainer.save_weights_and_get_sampling_client(f"iter_{iteration}")

The opening f" is missing before iter_. This causes a SyntaxError if copied. Fix: trainer.save_weights_and_get_sampling_client(name=f"iter_{iteration}").


🟡 Warning — lm-eval results access pattern is wrong

autonomous-researcher/SKILL.md, Phase 6:

if results["mmlu"] < baseline["mmlu"]:
    issues.append("MMLU degraded")

The lm-evaluation-harness returns nested results like {"mmlu": {"mmlu": {"acc": 0.72}}} or {"mmlu": {"acc": 0.72}}. Not flat {"mmlu": 0.72}. This raises KeyError or compares dict to float. Fix: results.get("mmlu", {}).get("acc", 0).


🟡 Warning — data variable is undefined in train_chunk example

tinker/SKILL.md:

for step in range(args.steps):
    result = trainer.forward_backward(data, "cross_entropy")  # data never defined
    losses.append(result.loss)

data is never loaded, defined, or constructed. CHUNK_SIZE, CHECKPOINT_EVERY, load_state(), save_state() are also undefined. The example crashes immediately. Add a data loading stub or make the incompleteness explicit.


🟡 Warning — mkdir should not be background

autonomous-researcher/SKILL.md, Phase 3:

terminal(command="mkdir -p ...", background=True)

mkdir is synchronous and instant. Backgrounding it provides no benefit. Should be background=False.


🟡 Warning — initialize_project() is pseudocode, not runnable

autonomous-researcher/SKILL.md, Phase 0: The function references undefined determine_mode(), calls write_file() without import, and mixes Python syntax with markdown prose. Label clearly as pseudocode/description, not executable code.


🟢 Nit — Routing table lists 8 phases but ml-paper-writing doesn't appear in it

The routing table covers Phases 1-8, but ml-paper-writing (Phase 7 Writing) is missing from the table while listed in requires. Inconsistency between metadata and routing table.


🟢 Nit — compute_rewards and compute_advantages are undefined stubs

tinker/SKILL.md, RL Training Pattern:

rewards = compute_rewards(samples, ground_truth)   # undefined
advantages = compute_advantages(rewards)            # undefined

These functions are never defined. Either implement minimal stubs or point to trl / openrl as the source.


✅ What is solid

  • Tinker API documentation is comprehensive — architecture diagram, model table, loss function table, and core API examples are all clear and correct
  • Self-chaining session concept is the right architectural pattern for long-running training across Hermes sessions
  • Autonomous Researcher pipeline is well-designed — 8-phase pipeline with clear decision points and mode routing (A-E) is sound
  • Checkpoint patterns (weights-only vs full state) are clearly distinguished
  • Iteration decision tree is a useful operational guide
  • Skill routing table correctly maps phases to primary and secondary skills

Verdict: Request Changes — two criticals (non-existent skill reference, Python syntax error) must be fixed. The lm-eval access pattern and undefined variables are functional bugs that cause runtime failures.

@ether-btc ether-btc left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hermes Agent Code Review — PR #760

Verdict: Request Changes

Two new skill files. One functional issue and one correctness issue.


🔴 Critical — Non-Existent Skill Reference in autonomous-researcher

File: skills/research/autonomous-researcher/SKILL.md

The skill references a non-existent skill tinker in the workflow routing:

Routes to specialized skills for each phase.

The skill describes routing to tinker for the training phase:

| Training | Execute | Orchestrates training... | Routes to specialized skills for each phase |

And in the workflow table:

| Training | Execute | Orchestrates training via tinker API... | → `tinker` skill |

But the tinker skill lives at skills/mlops/tinker/SKILL.md in the same PR. At skill discovery time, autonomous-researcher loads and scans for skill references before tinker is registered. The skill routing system will fail to find tinker if it references it by name.

More importantly — tinker is not a standard Hermes skill name in the skills_list output. Skills are referenced by their directory name, not their metadata name. The skill at skills/mlops/tinker/ would be referenced as mlops/tinker, not tinker.

Suggested fix:

  • Use the correct fully-qualified skill path mlops/tinker when routing to the training phase
  • Or add a related_skills or routes_to metadata field to make the routing explicit and checkable

🔴 Critical — Python Syntax Error in Tinker Skill Code Example

File: skills/mlops/tinker/SKILL.md:132

trainer.optim_step(types.AdamParams(learning_rate=1e-4)).result()

The module imports types as:

from tinker import types

But tinker does not have an AdamParams attribute in its public API. The correct API is tinker.AdamParams (passed directly to optim_step), not types.AdamParams. This will raise AttributeError: module 'tinker' has no attribute 'AdamParams' at runtime.

The same issue appears in the self-chaining example at line 242:

trainer.optim_step(tinker.AdamParams(learning_rate=args.lr)).result()

Should be:

trainer.optim_step(tinker.AdamParams(learning_rate=args.lr)).result()

Wait — that is exactly what it says. So the issue is from tinker import types is imported but types is never actually used with AdamParams. The example at line 132 uses types.AdamParams(...) but types was never defined.

Suggested fix: Either remove the from tinker import types line and use tinker.AdamParams, or define types correctly.


💡 Suggestion — Hermetic Code Examples

Both skill files use import tinker and from tinker import types in code examples that would be executed in a Hermes session. If tinker is not installed (likely, since it's an external service), these examples will immediately fail with ModuleNotFoundError. Consider wrapping them in a check:

try:
    import tinker
except ImportError:
    tinker = None  # External API — install required

Reviewed by Hermes Agent

lmsanch added a commit to lmsanch/hermes-agent that referenced this pull request Apr 24, 2026
@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have tool/skills Skills system (list, view, manage) labels May 3, 2026
@teknium1 teknium1 closed this May 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have tool/skills Skills system (list, view, manage) type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants