Add skills.loading: lazy config option (#2045) by AllenChen-Xingan · Pull Request #13980 · NousResearch/hermes-agent

AllenChen-Xingan · 2026-04-22T12:02:35Z

Fixes #2045.

Summary

Adds a skills.loading config option. When set to lazy, the <available_skills> block is replaced with a one-sentence hint that tells the agent to call skills_list() on demand. Default remains eager (current behavior) for backward compatibility.

Why

With N installed skills, the <available_skills> block costs ~30-50 tokens per skill on every turn — regardless of whether any skill will actually be used. On a 72-skill installation this is ~4,800 tokens per API call, re-sent every message. The cost scales linearly with skill count and becomes a hard ceiling on how many skills a user can reasonably install, especially for messaging-platform users (Telegram / Discord) where prompt caching is unavailable and every token is billed.

Issue #2045 proposed lazy loading: since skill_view() + skills_list() already exist, the index itself can be made on-demand. This PR implements that.

Measured impact (72 skills installed)

| Mode | Skill-index tokens / turn |
| --- | --- |
| `eager` (default) | ~4,800 |
| `lazy` | ~120 |
| Savings | ~97% |

Trade-off: the first domain-specific turn in a session spends one extra tool call (skills_list) before the agent picks a skill. Subsequent turns are cached. For chat-heavy workloads (personal assistants, messaging bots) the saving dominates; for one-shot CLI sessions where every installed skill could plausibly be needed, eager stays preferable — which is why it remains the default.
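The per-turn figures can be turned into rough session totals as follows. This is a back-of-the-envelope sketch, not a measurement: the 4,800 and 120 figures are the ones measured above, while `list_call_tokens` (a one-time cost for the `skills_list` result) is a hypothetical placeholder, and the model ignores prompt caching and any re-sending of the tool result in later turns.

```python
EAGER_TOKENS_PER_TURN = 4_800  # measured in this PR (72 skills)
LAZY_TOKENS_PER_TURN = 120     # measured in this PR (72 skills)


def index_tokens(turns: int, mode: str, list_call_tokens: int = 0) -> int:
    """Total skill-index tokens sent over a session of `turns` turns.

    `list_call_tokens` is an assumed one-time cost for the skills_list()
    result in lazy mode; it is not a figure from the PR.
    """
    if mode == "eager":
        return EAGER_TOKENS_PER_TURN * turns
    if mode == "lazy":
        return LAZY_TOKENS_PER_TURN * turns + list_call_tokens
    raise ValueError(f"unknown mode: {mode!r}")


def per_turn_saving() -> float:
    """Fraction of index tokens saved per turn by lazy mode."""
    return 1 - LAZY_TOKENS_PER_TURN / EAGER_TOKENS_PER_TURN


if __name__ == "__main__":
    print(f"per-turn saving: {per_turn_saving():.1%}")
    for turns in (1, 5, 20):
        eager = index_tokens(turns, "eager")
        # Assumption: the skills_list() result costs about one full index.
        lazy = index_tokens(turns, "lazy", list_call_tokens=4_800)
        print(f"turns={turns} eager={eager} lazy={lazy}")
```

If the one-time `skills_list` cost is assumed to be about one full index (4,800 tokens), a single-turn session comes out slightly behind in lazy mode (4,920 vs 4,800), while a 20-turn session comes out well ahead (7,200 vs 96,000). That matches the reasoning for keeping `eager` as the default for one-shot use.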

Changes

- hermes_cli/config.py: new skills.loading key, default "eager".
- agent/prompt_builder.py: add _is_skills_lazy() helper; when lazy, build_skills_system_prompt() returns a one-sentence hint instead of the full index.
- website/docs/user-guide/features/skills.md: document the new option with token impact and trade-offs.

No existing behavior changes; the feature is opt-in.

Backward compatibility

- Default is eager, so existing installations behave exactly as before.
- skills_list() and skill_view() already exist; no new tools added.
- LRU prompt cache still works: lazy mode has a single cache key.

Test plan

- Syntax check on modified Python files.
- Manual end-to-end on a 72-skill install: system prompt size confirmed to drop from ~4,800 to ~120 tokens.
- Agent correctly invokes skills_list() when asked domain-specific questions; correctly skips when asked casual/emotional questions.
- Add unit test under tests/agent/test_prompt_builder.py to verify both modes; happy to add if maintainers prefer.
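The proposed unit test could look roughly like this. Because the real `build_skills_system_prompt()` signature is not shown in this description, the sketch exercises a stand-in function, `build_prompt_stub` (hypothetical); a real test under tests/agent/test_prompt_builder.py would import the actual builder instead.

```python
def build_prompt_stub(loading: str, skill_names: list) -> str:
    """Stand-in for the real builder (hypothetical; illustration only)."""
    if loading == "lazy":
        return "Call skills_list() to see installed skills."
    body = "\n".join(skill_names)
    return f"<available_skills>\n{body}\n</available_skills>"


def test_eager_lists_every_skill():
    prompt = build_prompt_stub("eager", ["git-helper", "pdf-tools"])
    assert "<available_skills>" in prompt
    assert "git-helper" in prompt and "pdf-tools" in prompt


def test_lazy_returns_hint_only():
    prompt = build_prompt_stub("lazy", ["git-helper", "pdf-tools"])
    assert "<available_skills>" not in prompt
    assert "skills_list()" in prompt
    assert "git-helper" not in prompt


def test_lazy_size_is_independent_of_skill_count():
    few = build_prompt_stub("lazy", ["a"])
    many = build_prompt_stub("lazy", [f"skill-{i}" for i in range(72)])
    assert few == many
```

The functions follow pytest's `test_` naming convention, so they would be collected automatically if placed in a test module.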

Future work

Not in this PR, but natural follow-ups:

- Auto-switch to lazy when installed-skill count crosses a threshold (e.g. > 30).
- platform_loading map so Telegram could be lazy while CLI stays eager for the same user.
- A skills_search(query, k=5) tool that does semantic retrieval over skill descriptions; this would further reduce first-turn latency in lazy mode.
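The first two follow-ups could be combined into a single resolution step, sketched below. Everything here is hypothetical: `platform_loading` is only proposed above, `auto_lazy_threshold` is an invented key name, and the precedence order (per-platform override, then an explicit `loading` value, then the threshold) is just one possible choice.

```python
from typing import Mapping, Optional

VALID_MODES = {"eager", "lazy"}


def resolve_loading_mode(
    skills_cfg: Mapping, platform: Optional[str], skill_count: int
) -> str:
    """Pick "eager" or "lazy" for one session.

    Assumed precedence: platform_loading[platform], then an explicit
    skills.loading value, then the auto threshold, then "eager".
    """
    per_platform = skills_cfg.get("platform_loading") or {}
    if platform is not None and per_platform.get(platform) in VALID_MODES:
        return per_platform[platform]

    explicit = skills_cfg.get("loading")
    if explicit in VALID_MODES:
        return explicit

    threshold = skills_cfg.get("auto_lazy_threshold")  # invented key name
    if isinstance(threshold, int) and skill_count > threshold:
        return "lazy"
    return "eager"
```

With `{"platform_loading": {"telegram": "lazy", "cli": "eager"}}`, the same user would get the hint on Telegram and the full index on the CLI. Note that the threshold only applies in this sketch when `loading` is left unset, which would require the default to become "unset" rather than "eager".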

Happy to iterate on wording, config key naming, or any of the above.

Implements the lazy-loading proposal from issue NousResearch#2045.

When `skills.loading: lazy` is set in config.yaml, the full `<available_skills>` block is replaced with a one-sentence hint that instructs the agent to call `skills_list()` on demand. This cuts thousands of tokens per turn for users with 50+ skills installed while keeping all skills discoverable. Eager loading remains the default for backward compatibility.

Measured on a 72-skill install:
- eager: ~4,800 tokens per turn
- lazy: ~120 tokens per turn (-97%)

Trade-off: the first domain-specific turn in a session spends one extra tool call (`skills_list`) before the agent picks a skill; subsequent turns are cached. For chat-heavy workloads (personal assistants, messaging bots) the saving is substantial; for one-shot CLI sessions where every installed skill could plausibly be needed, `eager` is still preferable.

Changes:
- hermes_cli/config.py: add `skills.loading` default ("eager")
- agent/prompt_builder.py: `_is_skills_lazy()` helper + branch in `build_skills_system_prompt()`
- website/docs/user-guide/features/skills.md: document the option with token impact + trade-offs

Refs: NousResearch#2045

alt-glitch · 2026-04-22T12:08:24Z

Likely duplicate of #12379 — same feature (lazy skill loading via config option), same approach.

alt-glitch added the type/feature, P3, comp/agent, comp/cli, and tool/skills labels on Apr 22, 2026

alt-glitch mentioned this pull request Apr 23, 2026

feat: Lazy skill loading — skills.loading config option #12379

Open

This was referenced Apr 30, 2026

feat: lazy skill loading with per-session deduplication #5938

Closed

feat(skills): add lazy load mode to reduce startup token usage #19174

Closed

This was referenced May 16, 2026

[Feature] Suppress <available_skills> system_prompt injection for scripted/oneshot use #26806

Open

feat: dedupe loaded skills via active skill context registry #29300

Open

alt-glitch mentioned this pull request May 24, 2026

feat(skills): lazy load + offload — names-only index, skill_peek, skill_unload #31542

Open

AllenChen-Xingan wants to merge 1 commit into NousResearch:main from AllenChen-Xingan:lazy-skill-loading
