Add skills.loading: lazy config option (#2045) #13980
Open
AllenChen-Xingan wants to merge 1 commit into
Implements the lazy-loading proposal from issue NousResearch#2045. When `skills.loading: lazy` is set in config.yaml, the full `<available_skills>` block is replaced with a one-sentence hint that instructs the agent to call `skills_list()` on demand. This cuts thousands of tokens per turn for users with 50+ skills installed while keeping all skills discoverable. Eager loading remains the default for backward compatibility.

Measured on a 72-skill install:
- eager: ~4,800 tokens per turn
- lazy: ~120 tokens per turn (-97%)

Trade-off: the first domain-specific turn in a session spends one extra tool call (`skills_list`) before the agent picks a skill; subsequent turns are cached. For chat-heavy workloads (personal assistants, messaging bots) the saving is substantial; for one-shot CLI sessions where every installed skill could plausibly be needed, `eager` is still preferable.

Changes:
- hermes_cli/config.py: add `skills.loading` default ("eager")
- agent/prompt_builder.py: `_is_skills_lazy()` helper + branch in `build_skills_system_prompt()`
- website/docs/user-guide/features/skills.md: document the option with token impact + trade-offs

Refs: NousResearch#2045
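The per-turn figures quoted above can be sanity-checked with a little arithmetic. The sketch below uses only the two approximate measurements from the 72-skill install; the helper names and the 20-turn session length are illustrative assumptions, and the one-off cost of the extra `skills_list()` call is deliberately ignored because it was not measured here.

```python
# Approximate figures quoted for the 72-skill install.
EAGER_TOKENS_PER_TURN = 4_800
LAZY_TOKENS_PER_TURN = 120


def skills_prompt_tokens(per_turn: int, turns: int) -> int:
    """Total tokens spent on the skills part of the prompt over a session."""
    return per_turn * turns


def relative_saving(eager: int, lazy: int) -> float:
    """Fraction of skills-prompt tokens saved by switching from eager to lazy."""
    return (eager - lazy) / eager


saving = relative_saving(EAGER_TOKENS_PER_TURN, LAZY_TOKENS_PER_TURN)
print(f"per-turn saving: {saving:.1%}")  # per-turn saving: 97.5%

# Hypothetical 20-turn chat session (session length is an assumption).
turns = 20
saved = (skills_prompt_tokens(EAGER_TOKENS_PER_TURN, turns)
         - skills_prompt_tokens(LAZY_TOKENS_PER_TURN, turns))
print(saved)  # 93600
```

The computed 97.5% is consistent with the rounded "-97%" reported above.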
Collaborator
Likely duplicate of #12379: same feature (lazy skill loading via config option), same approach.
This was referenced Apr 30, 2026
This was referenced May 16, 2026
Fixes #2045.
Summary
Adds a `skills.loading` config option. When set to `lazy`, the `<available_skills>` block is replaced with a one-sentence hint that tells the agent to call `skills_list()` on demand. Default remains `eager` (current behavior) for backward compatibility.
Why
With N installed skills, the `<available_skills>` block costs ~30-50 tokens per skill on every turn, regardless of whether any skill will actually be used. On a 72-skill installation this is ~4,800 tokens per API call, re-sent every message. The cost scales linearly with skill count and becomes a hard ceiling on how many skills a user can reasonably install, especially for messaging-platform users (Telegram / Discord) where prompt caching is unavailable and every token is billed.
Issue #2045 proposed lazy loading: since `skill_view()` + `skills_list()` already exist, the index itself can be made on-demand. This PR implements that.
Measured impact (72 skills installed)
| Mode | Skills prompt tokens per turn |
| --- | --- |
| `eager` (default) | ~4,800 |
| `lazy` | ~120 (-97%) |

Trade-off: the first domain-specific turn in a session spends one extra tool call (`skills_list`) before the agent picks a skill. Subsequent turns are cached. For chat-heavy workloads (personal assistants, messaging bots) the saving dominates; for one-shot CLI sessions where every installed skill could plausibly be needed, `eager` stays preferable, which is why it remains the default.
Changes
- `hermes_cli/config.py`: new `skills.loading` key, default `"eager"`.
- `agent/prompt_builder.py`: add `_is_skills_lazy()` helper; when lazy, `build_skills_system_prompt()` returns a one-sentence hint instead of the full index.
- `website/docs/user-guide/features/skills.md`: document the new option with token impact and trade-offs.

No existing behavior changes; the feature is opt-in.
Backward compatibility
- Default is `eager`: existing installations behave exactly as before.
- `skills_list()` and `skill_view()` already exist; no new tools added.
Test plan
- In lazy mode, the agent calls `skills_list()` when asked domain-specific questions and correctly skips it when asked casual/emotional questions.
- No tests added yet in `tests/agent/test_prompt_builder.py` to verify both modes; happy to add if maintainers prefer.
Future work
Not in this PR, but natural follow-ups:
- Auto-switch to `lazy` when installed-skill count crosses a threshold (e.g. > 30).
- A per-platform `platform_loading` map so Telegram could be `lazy` while CLI stays `eager` for the same user.
- A `skills_search(query, k=5)` tool that does semantic retrieval over skill descriptions, which would further reduce first-turn latency in lazy mode.

Happy to iterate on wording, config key naming, or any of the above.
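To make the first two follow-ups concrete, here is one way the loading mode could be resolved if both were added. Nothing below exists in this PR: `resolve_loading_mode`, the `auto_lazy_threshold` key, and the precedence order (platform override, then explicit global setting, then threshold) are hypothetical choices for illustration; only the `platform_loading` name comes from the list above.

```python
from typing import Any, Mapping, Optional


def resolve_loading_mode(
    skills_cfg: Mapping[str, Any],
    platform: Optional[str],
    installed_count: int,
) -> str:
    """Pick 'eager' or 'lazy' for one session (hypothetical precedence)."""
    # 1. A per-platform override wins, e.g. {"telegram": "lazy"}.
    platform_map = skills_cfg.get("platform_loading") or {}
    if platform is not None and platform in platform_map:
        return platform_map[platform]

    # 2. An explicit global skills.loading setting comes next.
    if "loading" in skills_cfg:
        return skills_cfg["loading"]

    # 3. Otherwise switch to lazy once the skill count exceeds the threshold.
    threshold = skills_cfg.get("auto_lazy_threshold")  # hypothetical key
    if threshold is not None and installed_count > threshold:
        return "lazy"

    # 4. Default stays eager, matching current behavior.
    return "eager"


cfg = {"loading": "eager", "platform_loading": {"telegram": "lazy"}}
print(resolve_loading_mode(cfg, "telegram", 72))  # lazy
print(resolve_loading_mode(cfg, "cli", 72))       # eager
```

Letting an explicit `loading` value outrank the threshold means a user who deliberately chose `eager` is never switched automatically; a different precedence would be equally defensible.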