feat(skills): add autonomous research and tinker GPU training capabilities by ar0cket1 · Pull Request #760 · NousResearch/hermes-agent

ar0cket1 · 2026-03-09T10:21:30Z

Summary

Add two new skills to enable fully autonomous ML research workflows:

1. Tinker Skill (`mlops/tinker`)

API-based GPU training for CPU-only agents
LoRA fine-tuning, RL, DPO, RLHF support
Self-chaining session pattern for long training runs
Checkpoint-based resumption and monitoring

2. Autonomous Researcher Skill (`research/autonomous-researcher`)

End-to-end research pipeline orchestration
8-phase workflow: literature review → ideation → implementation → training → evaluation → iteration → writing → publishing
Multiple input modes (full autonomous, user hypothesis, user data, etc.)
Routes to specialized skills for each phase

Motivation

These skills close the "compute gap" for Hermes Agent, enabling autonomous training and research without local GPU access. The Tinker API allows the agent to control training loops while offloading GPU compute to remote infrastructure.

The autonomous researcher skill provides a complete research automation framework, combining:

Literature review (arxiv)
Implementation (codex/claude-code)
Training (tinker)
Evaluation (lm-eval-harness)
Publication (HF Hub, GitHub)

Test Plan

Verify skill YAML frontmatter is valid
Check skill descriptions appear correctly in skills_list()
Test skill loading with skill_view()
Verify cross-references to required skills are correct

Files Changed

skills/mlops/tinker/SKILL.md - New Tinker training skill
skills/research/autonomous-researcher/SKILL.md - New autonomous research skill

teknium1 · 2026-03-11T07:35:08Z

Thanks for contributing these! Putting into draft for now — we'll review properly after the hackathon wraps up.

A few things we'll want to address before merging:

Quality pass needed — Both skills are heavy on pseudocode and aspirational workflow descriptions. We'd want to tighten them up with tested, actionable instructions that actually work end-to-end.
Terminal artifacts — Both files have a leaked fence string at the end that needs to be removed.
Dependency references — A few skill names in the autonomous-researcher metadata are wrong (hermes-agent-spawning → hermes-agent, evaluating-llms-harness → lm-evaluation-harness).
Bundled vs Hub — These are fairly specialized (Tinker API key required, ML research niche). May be better as Skills Hub entries rather than bundled with every install.

Will circle back after the hackathon!

nidhishgajjar · 2026-04-14T20:50:41Z

Orb Code Review (GLM 5.1 on Orb Cloud)

PR #760 Review: feat(skills): add autonomous research and tinker GPU training capabilities

Reviewed both new SKILL.md files: skills/mlops/tinker/SKILL.md and skills/research/autonomous-researcher/SKILL.md.

🐛 Bugs

1. Wrong module reference in Supervised Learning example (tinker SKILL.md)

trainer.optim_step(types.AdamParams(learning_rate=1e-4)).result()

types is never imported — should be tinker.AdamParams to match every other example in the file.

2. Undefined data variable in code examples
Multiple code blocks reference data or batch without defining them (Quick Start, Supervised Learning, train_chunk.py). Users copying these examples will hit NameError. Consider adding a placeholder like data = load_batch() or noting it with a comment.

3. Missing Mode D example invocation
The Input Modes section defines Modes A–E, but the Example Invocations section jumps from Mode C to Mode E. Mode D ("User-Provided Everything") has no corresponding example.

🔒 Security

4. Shell injection in self-chaining pattern

f"hermes chat -q 'Continue training with checkpoint {decision["checkpoint_path"]}'"

The checkpoint_path from the API response is interpolated directly into a shell command without sanitization. If the path is manipulated or contains shell metacharacters, this is a command injection vector. Consider using shlex.quote() or passing the value as a separate argument.
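A minimal sketch of the two mitigations named above, assuming the `hermes chat -q` invocation quoted from the skill file. The function names are hypothetical and not part of the PR.

```python
import shlex


def build_continue_command(checkpoint_path: str) -> str:
    """Return a shell-safe command string for resuming training.

    The `hermes chat -q` form is copied from the snippet under review;
    this helper itself is illustrative only.
    """
    prompt = f"Continue training with checkpoint {checkpoint_path}"
    # shlex.quote makes the whole prompt one literal shell word, so quotes
    # or semicolons inside checkpoint_path cannot end the argument early.
    return f"hermes chat -q {shlex.quote(prompt)}"


def build_continue_argv(checkpoint_path: str) -> list[str]:
    """Alternative: an argv list for subprocess.run(argv) with no shell."""
    return [
        "hermes",
        "chat",
        "-q",
        f"Continue training with checkpoint {checkpoint_path}",
    ]
```

The argv form is usually the simpler choice, since no shell ever parses the value; the quoted string form is only needed when the caller insists on a single command string.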

5. git add . && git push without review
The publishing workflow includes git add . which could accidentally commit secrets, large binaries, or other unintended files. Consider being more explicit about what gets committed.
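One way to be explicit about what gets committed is to stage a vetted list of paths instead of `.`. The sketch below only builds the `git add` argument list; the allowlist, the denylist, and the function name are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical allowlist: only files with these suffixes may be published.
ALLOWED_SUFFIXES = {".md", ".py", ".json", ".yaml", ".yml"}
# Hypothetical denylist of file names that commonly hold secrets.
BLOCKED_NAMES = {".env", "credentials.json"}


def git_add_argv(paths: list[str]) -> list[str]:
    """Return an argv for `git add` that stages only explicitly vetted paths."""
    vetted = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.name in BLOCKED_NAMES or path.suffix not in ALLOWED_SUFFIXES:
            raise ValueError(f"refusing to stage {raw!r}")
        vetted.append(str(path))
    if not vetted:
        raise ValueError("nothing to stage")
    # "--" stops git from interpreting a path as an option.
    return ["git", "add", "--", *vetted]
```

The resulting list can be passed to `subprocess.run` without a shell, and a rejected path fails loudly before anything is staged.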

📋 Content / Quality

6. Trailing HERMES_EOF artifacts in both files
Both SKILL.md files end with garbage text:

HERMES_EOF; __hermes_rc=$?; printf '__HERMES_FENCE_a9f7b3__'; exit $__hermes_rc

This appears to be a session fence marker that was accidentally included in the file content. It should be removed, and both files should end with a proper newline.
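A cleanup along these lines could be scripted. The regular expression below is an assumption derived from the single artifact quoted above, and the function name is hypothetical.

```python
import re

# Matches a trailing artifact shaped like the one quoted in the review.
# The pattern is inferred from that one example and may need widening.
_FENCE_TAIL = re.compile(r"HERMES_EOF;[^\n]*__HERMES_FENCE_\w+__[^\n]*\s*\Z")


def strip_fence_artifact(text: str) -> str:
    """Remove a trailing fence artifact and end the file with one newline."""
    cleaned = _FENCE_TAIL.sub("", text)
    return cleaned.rstrip("\n") + "\n"
```

Because the pattern is anchored at the end of the text, a mention of `HERMES_EOF` elsewhere in a file is left untouched.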

7. Massive duplication between the two files
Phases 4–8 (Training, Evaluation, Iteration, Writing, Publishing), the Skill Routing Table, Example Invocations, Quick Reference Commands, and Troubleshooting sections are nearly identical in both files. This creates a significant maintenance burden — any fix or update must be applied in two places. Consider having the autonomous-researcher skill reference the tinker skill for training-specific details rather than duplicating them.

8. tinker checkpoint download is undocumented
The evaluation workflow references tinker checkpoint download tinker://model_id/final as a CLI command, but this command doesn't appear in the Tinker skill's Core API or Setup sections. Is this a real CLI command or should it be a Python API call?

Summary

The overall structure and concept of both skills are solid: the 8-phase research pipeline and self-chaining session pattern are well-designed. However, the code examples contain bugs that would confuse users, there's a security concern with shell interpolation, and the trailing HERMES_EOF artifacts need to be cleaned up before merging. The duplication between files is the largest maintenance concern.

Recommendation: Request changes — address the HERMES_EOF artifacts, the types.AdamParams bug, undefined variables in examples, and the shell injection risk.

ether-btc · 2026-04-20T05:41:54Z

Charon Code Review — PR #760

Two new skills: Tinker (GPU fine-tuning via Thinking Machines Lab API) and Autonomous Researcher (end-to-end ML research pipeline). Both are SKILL.md files — primarily documentation/instructional content. Structural review below.

🔴 Critical — `ml-paper-writing` skill listed as required but does not exist

autonomous-researcher/SKILL.md metadata:

requires:
  - ml-paper-writing   # ← listed here

ml-paper-writing is not in the available skills registry. Phase 7 (Writing) routes to this non-existent skill. The skill invocation will fail at runtime. Either remove it from requires and the routing table, or create the actual skill first.
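This kind of mismatch can be caught before merge with a small cross-reference check. The sketch below assumes the `requires` list and the set of registered skill names have already been loaded; how Hermes exposes its registry is not shown in this thread, so both inputs and the function name are hypothetical.

```python
def missing_requirements(requires: list[str], registry: set[str]) -> list[str]:
    """Return required skill names that are absent from the registry.

    Order follows the `requires` list so the report matches the metadata.
    """
    return [name for name in requires if name not in registry]


# Illustrative data only: the registry contents here are made up.
example_registry = {"hermes-agent", "lm-evaluation-harness"}
example_requires = ["hermes-agent", "ml-paper-writing"]
unresolved = missing_requirements(example_requires, example_registry)
```

An empty result means every declared dependency resolves; anything else is a name to fix or a skill to add first.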

🔴 Critical — Tinker skill has a Python syntax error in RL training example

tinker/SKILL.md:

for iteration in range(15):
    sampler = trainer.save_weights_and_get_sampling_client(f"iter_{iteration}")

The opening f" is missing before iter_. This causes a SyntaxError if copied. Fix: trainer.save_weights_and_get_sampling_client(name=f"iter_{iteration}").

🟡 Warning — lm-eval results access pattern is wrong

autonomous-researcher/SKILL.md, Phase 6:

if results["mmlu"] < baseline["mmlu"]:
    issues.append("MMLU degraded")

The lm-evaluation-harness returns nested results like {"mmlu": {"mmlu": {"acc": 0.72}}} or {"mmlu": {"acc": 0.72}}. Not flat {"mmlu": 0.72}. This raises KeyError or compares dict to float. Fix: results.get("mmlu", {}).get("acc", 0).
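A helper that tolerates the result shapes described above might look like the following sketch. It handles only the shapes mentioned in this review; the real output layout and metric key names depend on the lm-evaluation-harness version, so treat both as assumptions. The function names are hypothetical.

```python
from typing import Any, Optional


def task_accuracy(results: dict[str, Any], task: str, metric: str = "acc") -> Optional[float]:
    """Extract one metric for a task, or None when it is not present."""
    node = results.get(task)
    # Unwrap the doubly nested shape {"mmlu": {"mmlu": {"acc": ...}}}.
    if isinstance(node, dict) and isinstance(node.get(task), dict):
        node = node[task]
    if isinstance(node, dict):
        value = node.get(metric)
        return float(value) if isinstance(value, (int, float)) else None
    # Flat shape {"mmlu": 0.72}, as the skill's original code assumed.
    return float(node) if isinstance(node, (int, float)) else None


def degraded(results: dict[str, Any], baseline: dict[str, Any], task: str = "mmlu") -> bool:
    """True only when both scores exist and the new one is lower."""
    new = task_accuracy(results, task)
    old = task_accuracy(baseline, task)
    return new is not None and old is not None and new < old
```

Returning `None` for a missing score, instead of defaulting to 0, keeps an absent result from being reported as a regression.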

🟡 Warning — `data` variable is undefined in train_chunk example

tinker/SKILL.md:

for step in range(args.steps):
    result = trainer.forward_backward(data, "cross_entropy")  # data never defined
    losses.append(result.loss)

data is never loaded, defined, or constructed. CHUNK_SIZE, CHECKPOINT_EVERY, load_state(), save_state() are also undefined. The example crashes immediately. Add a data loading stub or make the incompleteness explicit.
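As a sketch of what the missing pieces could look like, the stubs below give the names listed above a concrete shape. The constant values, the JSON state layout, and `load_batch` are assumptions; none of them come from the skill file, and no Tinker call is involved.

```python
import json
from pathlib import Path

CHUNK_SIZE = 100        # steps per session; illustrative value
CHECKPOINT_EVERY = 25   # steps between checkpoints; illustrative value


def load_state(path: Path) -> dict:
    """Read the resume state, or return a fresh one on the first run."""
    if path.exists():
        return json.loads(path.read_text())
    return {"step": 0, "checkpoint_path": None, "losses": []}


def save_state(path: Path, state: dict) -> None:
    """Write the state through a temp file so a crash leaves no partial JSON."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(path)


def load_batch(step: int) -> list:
    """Placeholder data source; deliberately unimplemented."""
    raise NotImplementedError("replace with real data loading for your dataset")
```

Leaving `load_batch` unimplemented makes the gap explicit: a copied example fails with a clear message instead of a `NameError`.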

🟡 Warning — mkdir should not be background

autonomous-researcher/SKILL.md, Phase 3:

terminal(command="mkdir -p ...", background=True)

mkdir is synchronous and instant. Backgrounding it provides no benefit. Should be background=False.

🟡 Warning — `initialize_project()` is pseudocode, not runnable

autonomous-researcher/SKILL.md, Phase 0: The function references undefined determine_mode(), calls write_file() without import, and mixes Python syntax with markdown prose. Label clearly as pseudocode/description, not executable code.

🟢 Nit — Routing table lists 8 phases but `ml-paper-writing` doesn't appear in it

The routing table covers Phases 1-8, but ml-paper-writing (Phase 7 Writing) is missing from the table while listed in requires. Inconsistency between metadata and routing table.

🟢 Nit — `compute_rewards` and `compute_advantages` are undefined stubs

tinker/SKILL.md, RL Training Pattern:

rewards = compute_rewards(samples, ground_truth)   # undefined
advantages = compute_advantages(rewards)            # undefined

These functions are never defined. Either implement minimal stubs or point to trl / openrl as the source.
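For reference, minimal stubs could be as small as the sketch below. Exact-match rewards and mean/std-normalized advantages are simple stand-ins chosen here for illustration; the skill file does not say which reward or advantage scheme it intends.

```python
from statistics import mean, pstdev


def compute_rewards(samples: list[str], ground_truth: list[str]) -> list[float]:
    """Score 1.0 for an exact match after trimming whitespace, else 0.0."""
    if len(samples) != len(ground_truth):
        raise ValueError("samples and ground_truth must have the same length")
    return [1.0 if s.strip() == g.strip() else 0.0 for s, g in zip(samples, ground_truth)]


def compute_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Center rewards on the batch mean and scale by the batch std."""
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every reward in the batch is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]
```

With these definitions a batch where every sample earns the same reward yields all-zero advantages, so it contributes no update signal.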

✅ What is solid

Tinker API documentation is comprehensive — architecture diagram, model table, loss function table, and core API examples are all clear and correct
Self-chaining session concept is the right architectural pattern for long-running training across Hermes sessions
Autonomous Researcher pipeline is well-designed — 8-phase pipeline with clear decision points and mode routing (A-E) is sound
Checkpoint patterns (weights-only vs full state) are clearly distinguished
Iteration decision tree is a useful operational guide
Skill routing table correctly maps phases to primary and secondary skills

Verdict: Request Changes — two criticals (non-existent skill reference, Python syntax error) must be fixed. The lm-eval access pattern and undefined variables are functional bugs that cause runtime failures.

ether-btc

Hermes Agent Code Review — PR #760

Verdict: Request Changes

Two new skill files. One functional issue and one correctness issue.

🔴 Critical — Non-Existent Skill Reference in `autonomous-researcher`

File: skills/research/autonomous-researcher/SKILL.md

The skill references a non-existent skill tinker in the workflow routing:

Routes to specialized skills for each phase.

The skill describes routing to tinker for the training phase:

| Training | Execute | Orchestrates training... | Routes to specialized skills for each phase |

And in the workflow table:

| Training | Execute | Orchestrates training via tinker API... | → `tinker` skill |

But the tinker skill lives at skills/mlops/tinker/SKILL.md in the same PR. At skill discovery time, autonomous-researcher loads and scans for skill references before tinker is registered. The skill routing system will fail to find tinker if it references it by name.

More importantly — tinker is not a standard Hermes skill name in the skills_list output. Skills are referenced by their directory name, not their metadata name. The skill at skills/mlops/tinker/ would be referenced as mlops/tinker, not tinker.

Suggested fix:

Use the correct fully-qualified skill path mlops/tinker when routing to the training phase
Or add a related_skills or routes_to metadata field to make the routing explicit and checkable
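If skills are indeed addressed by directory path, as this review claims (not confirmed elsewhere in the thread), the qualified name can be derived from the file location. The function name and the `skills` root default are assumptions.

```python
from pathlib import PurePosixPath


def skill_id_from_path(skill_md: str, root: str = "skills") -> str:
    """Map 'skills/<category>/<name>/SKILL.md' to '<category>/<name>'."""
    path = PurePosixPath(skill_md)
    if path.name != "SKILL.md":
        raise ValueError(f"not a SKILL.md path: {skill_md!r}")
    # relative_to raises ValueError when the file is not under the root.
    return str(path.parent.relative_to(root))
```

Applied to the two files in this PR, this gives `mlops/tinker` and `research/autonomous-researcher`.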

🔴 Critical — Python Syntax Error in Tinker Skill Code Example

File: skills/mlops/tinker/SKILL.md:132

trainer.optim_step(types.AdamParams(learning_rate=1e-4)).result()

The module imports types as:

from tinker import types

But tinker does not have an AdamParams attribute in its public API. The correct API is tinker.AdamParams (passed directly to optim_step), not types.AdamParams. This will raise AttributeError: module 'tinker' has no attribute 'AdamParams' at runtime.

The same issue appears in the self-chaining example at line 242:

trainer.optim_step(tinker.AdamParams(learning_rate=args.lr)).result()

Should be:

trainer.optim_step(tinker.AdamParams(learning_rate=args.lr)).result()

Wait — that is exactly what it says. So the issue is from tinker import types is imported but types is never actually used with AdamParams. The example at line 132 uses types.AdamParams(...) but types was never defined.

Suggested fix: Either remove the from tinker import types line and use tinker.AdamParams, or define types correctly.

💡 Suggestion — Hermetic Code Examples

Both skill files use import tinker and from tinker import types in code examples that would be executed in a Hermes session. If tinker is not installed (likely, since it's an external service), these examples will immediately fail with ModuleNotFoundError. Consider wrapping them in a check:

try:
    import tinker
except ImportError:
    tinker = None  # External API — install required

Reviewed by Hermes Agent

teknium1 marked this pull request as draft March 11, 2026 07:34

ether-btc suggested changes Apr 20, 2026

lmsanch added a commit to lmsanch/hermes-agent that referenced this pull request Apr 24, 2026

feat(email): add recipient filtering to prevent cross-agent contamination (NousResearch#760)

ae53a46

lmsanch mentioned this pull request Apr 24, 2026

feat(toryx): integrate 10 Toryx-specific commits from feat/thompson-sampling-routing (#833) lmsanch/hermes-agent#11

Merged

alt-glitch added the type/feature, P3, and tool/skills labels May 3, 2026

teknium1 closed this May 11, 2026

ar0cket1 wants to merge 1 commit into NousResearch:main from ar0cket1:feat/autonomous-research-progress

5 participants
