feat(skills): add autoresearch — autonomous git-based experiment loop#5112
tugrulguner wants to merge 1 commit into
Q&A (from maintainer review)
Evaluation metric for text outputs
Karpathy's autoresearch uses
Honest answer: Yes, this is LLM self-evaluation. Unlike
In practice this worked on our AI coding agents test, where 6 experiments all scored 19-24/25 and were merged with real pricing data, feature matrices, and market analysis.
Literature research testing
Not yet. We have validated on Mode 1 (wine dataset ML optimization, 6 experiments) and Mode 2 (AI coding agents competitive analysis, 6 experiments). The system can handle literature research because the agent has browser and web tools, but PDF parsing, source deduplication, and academic citation formatting are untested. The rubric would need tweaking for academic work (e.g., evidence >= 4 minimum, require DOI/URL citations).
Scope of changes
Entirely contained within the skill. Zero changes to core Hermes:
If you delete
Iteration limits vs. context window
The git loop does not reduce token consumption per run; the agent might still burn through 50 iterations. The benefit is different:
Summary
Adds the autoresearch pattern as a native Hermes skill: autonomous background research using a branch → experiment → evaluate → merge/revert git loop. Zero external dependencies.
What It Does
git checkout -b exp_N from main
Architecture
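The branch → experiment → evaluate → merge/revert loop named in the Summary can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `experiment_loop`, `run_experiment`, `evaluate`, `threshold`, and the injectable `git` runner are all hypothetical names, and treating "revert" as deleting the experiment branch is an assumption. Only the `exp_N` branch naming and branching from `main` come from the PR text.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes git arguments (without the leading "git") and executes them.
Runner = Callable[[Sequence[str]], None]


def run_git(args: Sequence[str]) -> None:
    """Run a git command in the current repository, raising on failure."""
    subprocess.run(["git", *args], check=True)


def experiment_loop(
    n_experiments: int,
    run_experiment: Callable[[int], None],  # hypothetical: performs experiment N
    evaluate: Callable[[int], float],       # hypothetical: returns a score for N
    threshold: float,                       # assumption: merge when score >= threshold
    git: Runner = run_git,
    base: str = "main",
) -> List[int]:
    """Return the indices of experiments that were merged into `base`."""
    merged: List[int] = []
    for n in range(1, n_experiments + 1):
        branch = f"exp_{n}"
        # Branch: create exp_N starting from the base branch.
        git(["checkout", "-b", branch, base])
        # Experiment: let the caller change files in the working tree.
        run_experiment(n)
        git(["add", "-A"])
        git(["commit", "--allow-empty", "-m", f"experiment {n}"])
        # Evaluate: score the result while still on the experiment branch.
        score = evaluate(n)
        git(["checkout", base])
        if score >= threshold:
            # Merge: keep the experiment in the base branch history.
            git(["merge", "--no-ff", "--no-edit", branch])
            merged.append(n)
        else:
            # Revert (assumed here to mean discarding the branch).
            git(["branch", "-D", branch])
    return merged
```

Passing a recording function as `git` (for example, `git=commands.append` with `commands` a list) shows the exact command sequence without touching a real repository, which is a convenient way to dry-run the loop.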
Testing
How to Use
Launch from any Hermes Agent session:
Related