feat(telegram): add sendMessageDraft transport for DM streaming #17179
Closed
hash7in wants to merge 141 commits into
Uses the Telegram Bot API 9.3+ sendMessageDraft API for streaming responses in private chats (DMs). This provides:
- No flood-control limits (unlike edit_message_text ~30/s cap)
- Smoother client-side animations (diff-based updates)
- Better UX for long-running responses

Implementation:
- telegram.py: Add send_draft_message() adapter method
- stream_consumer.py: Add _send_or_edit_draft() path with __draft__ sentinel, draft_id generation, and proper segment break handling
- run.py: Enable draft_transport for Telegram private chats (int chat_id)

Draft content is sent as plain text (no MarkdownV2) to avoid parse failures on incomplete intermediate content. The final response is delivered separately with full Markdown formatting via the standard send path, which also dismisses the ephemeral draft on the client.
- StreamingConfig.transport: 'auto' (default), 'draft', 'edit', or 'off'
- 'auto' uses draft for DMs if adapter supports it, otherwise edit
- _resolve_draft_mode() decides based on config + adapter + chat_type
- Pass chat_type via metadata so consumer can make transport decision
- Add 10 tests: 7 unit + 3 end-to-end covering all transport modes
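The transport decision described above can be sketched as a small pure function. This is a minimal illustration, not the PR's `_resolve_draft_mode()`: the function name, its parameters, the `"private"` chat-type string, and the handling of an explicit `'draft'` on an unsupported adapter are all assumptions.

```python
def resolve_draft_mode(transport: str, adapter_supports_draft: bool, chat_type: str) -> bool:
    """Return True when streaming should go through the draft transport.

    Hypothetical helper: the real method lives on the stream consumer and
    may take different inputs.
    """
    if transport not in ("auto", "draft"):
        # 'edit' and 'off' never use drafts.
        return False
    # Assumption: drafts are only usable in DMs on adapters that support them,
    # and an explicit 'draft' falls back the same way 'auto' does.
    return adapter_supports_draft and chat_type == "private"
```

With this shape, a group chat or an adapter without draft support quietly resolves to the edit path.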
)

* fix(tui): honor documented mouse_tracking config key

  The TUI runtime was reading display.tui_mouse while docs and user-facing examples pointed users at display.mouse_tracking. That made persistent mouse-disable config look like a no-op for users trying to restore native terminal selection/copy behavior on Linux/SSH/tmux terminals.

  Use display.mouse_tracking as the canonical key, keep display.tui_mouse as a legacy fallback, and have /mouse write the documented key. Both gateway config.get and client-side config sync now share the same precedence: the canonical key wins, then the legacy key, then default on.

* review(copilot): align mouse tracking config coercion

  - Load gateway config once before deriving display.mouse_tracking state.
  - Use key-presence precedence on the TUI client too, so canonical mouse_tracking wins over legacy tui_mouse even when the value is null.
  - Treat numeric 0 as disabled on both gateway and client, matching the existing string "0" handling.
  - Widen ConfigDisplayConfig mouse fields because config.get full returns raw YAML, not normalized booleans.
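The precedence and coercion rules above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, a null value is assumed to mean "fall back to the default (on)", and only the string `"0"` is treated as a disabling string because that is the only one the commit message mentions.

```python
from typing import Any, Mapping


def _coerce_enabled(value: Any) -> bool:
    """Coerce a raw YAML value to a mouse-tracking on/off flag."""
    if value is None:
        return True  # assumption: null means "use the default", which is on
    if isinstance(value, str):
        return value.strip() != "0"  # string "0" disables; other strings not covered here
    return bool(value)  # False and numeric 0 disable


def mouse_tracking_enabled(display: Mapping[str, Any]) -> bool:
    """Key-presence precedence: canonical key, then legacy key, then default on."""
    if "mouse_tracking" in display:
        return _coerce_enabled(display["mouse_tracking"])
    if "tui_mouse" in display:
        return _coerce_enabled(display["tui_mouse"])
    return True
```

Checking key presence rather than truthiness is what lets a canonical `mouse_tracking: null` shadow a legacy `tui_mouse: false`.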
Two targeted fixes on the critical path from `hermes --tui` launch to
`gateway.ready`:
1. **Defer `@hermes/ink` import in memoryMonitor.ts.** The static top-level
import dragged the full ~414KB Ink bundle (React + renderer + all
components/hooks) onto the critical path *before* `gw.start()` could
spawn the Python gateway — serialising ~155ms of Node work in front of
it on every launch. `evictInkCaches` only runs inside the 10-second
tick under heap pressure, so it moves to a lazy dynamic import. First
tick hits the ESM cache because the app entry has long since imported
`@hermes/ink`.
2. **Gate `tools.mcp_tool` import on config in tui_gateway/entry.py.**
Importing the module transitively pulls the MCP SDK + pydantic + httpx
+ jsonschema + starlette formparsers (~200ms). The overwhelming
majority of users have no `mcp_servers` configured, so this runs for
nothing. A cheap `load_config()` check (~25ms) skips the 200ms import
when no servers are declared, with a conservative fallback to the old
behaviour if the config probe itself fails.
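A minimal sketch of the config-gated import from item 2, assuming a `load_config()` callable that returns a mapping. The helper names are hypothetical, and the module name is passed in only so the example stays self-contained.

```python
import importlib
from types import ModuleType
from typing import Any, Callable, Mapping, Optional


def should_import_mcp(load_config: Callable[[], Mapping[str, Any]]) -> bool:
    """Decide whether the expensive MCP import is needed."""
    try:
        config = load_config()
    except Exception:
        # Conservative fallback: if the probe fails, behave as before and import.
        return True
    return bool(config.get("mcp_servers"))


def maybe_import(module_name: str, load_config: Callable[[], Mapping[str, Any]]) -> Optional[ModuleType]:
    """Import module_name only when MCP servers are configured."""
    if should_import_mcp(load_config):
        return importlib.import_module(module_name)
    return None
```

The probe costs a config read on every launch, which the figures above put at about 25ms against roughly 200ms for the import it avoids.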
## Measurements (macOS Terminal.app, Apple Silicon, n=12)
| Metric | Before (p50) | After (p50) | Δ |
|----------------------------|--------------|-------------|----------|
| Python gateway boot alone | 252–365ms | 105–151ms | −180ms |
| `hermes --tui` banner paint | 686ms | 665ms | −21ms |
| `hermes --tui` → ready | **1843ms** | **1655ms** | **−188ms (−10.2%)** |
| `hermes --tui` → ready p90 | 1932ms | 1778ms | −154ms |
| stdev (ready) | 126ms | 83ms | also more consistent |
## Tests
- `scripts/run_tests.sh tests/tui_gateway/ tests/tools/test_mcp_tool.py`:
195 passed. (The one pre-existing failure in
`test_session_resume_returns_hydrated_messages` reproduces on main —
unrelated, it's a mock-DB kwarg mismatch.)
- `ui-tui` vitest: 430 tests, all pass.
- `npm run type-check` in ui-tui: clean.
## Notes
- Node-side first paint ("banner") didn't move meaningfully because that
latency is dominated by Ink's render pipeline + React mount, not by
which imports load first.
- The win shows up entirely in the time from banner to `gateway.ready`
— exactly where we expected it, since both fixes shorten the Python
gateway's boot path or let it overlap more with Node startup.
- No user-visible behaviour change. Memory monitoring still fires every
10s; MCP still works when `mcp_servers` is configured.
NousResearch#17098)

Two amplifying optimizations to per-turn overhead in the gateway:

1. get_tool_definitions() memoization (model_tools.py)
   Keyed on (frozenset(enabled), frozenset(disabled), registry._generation, config.yaml mtime+size). Only active when quiet_mode=True (which is every hot-path caller: gateway, AIAgent.__init__); quiet_mode=False keeps the existing print side effects. Cached path returns a shallow-copy list sharing read-only schema dicts.
   Measured: 7.5 ms → 0.01 ms per call (~750× speedup). Gateway constructs fresh AIAgent per message, so this saves ~7 ms/turn before any LLM work.

2. check_fn() TTL cache (tools/registry.py)
   check_fn callables like check_terminal_requirements probe external state (Docker daemon, Modal SDK, playwright binary). For a long-lived process, hitting them on every get_definitions() pass was pure waste: external state changes on human timescales. 30 s TTL so env-var flips (hermes tools enable X) propagate within a turn or two without explicit invalidation.
   Measured: first call 7.5ms → 1.6ms (check_fn probes now dominate); subsequent calls ~0.01ms via the upstream memoization.

Invalidation surface:
- registry._generation bumps on register/deregister/register_toolset_alias, invalidating the memoized definitions automatically.
- config.yaml mtime in the cache key captures user-visible config edits affecting dynamic schemas (execute_code mode, discord allowlist).
- invalidate_check_fn_cache() exposed for explicit flushes (e.g. after hermes tools enable/disable).
- tests/conftest.py autouse fixture clears both caches before every test so env-var monkeypatches don't see stale results.

Also fixes a regression from PR NousResearch#17046 that I missed:
- tools/web_tools.py: Firecrawl was removed from module scope by the lazy import, breaking 8 tests that patch 'tools.web_tools.Firecrawl'. Applied the same _FirecrawlProxy pattern used in auxiliary_client/ run_agent for OpenAI (module-level proxy that looks like the class but imports the SDK on first call/isinstance; patch() replaces the attribute as usual).

Verified:
- 49/49 tests/tools/test_web_tools_config.py pass (was 8 failing on main)
- 68/68 tests/tools/test_homeassistant_tool.py pass (was 1 failing in the full suite due to check_fn TTL cross-test pollution; fixed by the autouse fixture)
- 3887/3895 tests/tools/ (8 pre-existing fails: 2 delegate, 1 mcp dynamic discovery, 5 mcp structured content, all confirmed on main)
- 2973/2976 tests/agent/ + tests/run_agent/ (3 pre-existing fails)
- 868/868 tests/run_agent/ (excluding test_run_agent.py which has pre-existing suite-level issues)
- Live smoke: 2 turns + /model switch + tool calls, zero errors in agent.log session window.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
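The check_fn() TTL cache can be illustrated with a small class. This is a sketch, not the code in tools/registry.py: the class name, method names, and the injectable clock are assumptions made so the behaviour can be exercised without waiting 30 seconds.

```python
import time
from typing import Callable, Dict, Tuple


class CheckFnCache:
    """Cache boolean probe results for a fixed time-to-live."""

    def __init__(self, ttl_seconds: float = 30.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bool]] = {}

    def check(self, name: str, check_fn: Callable[[], bool]) -> bool:
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the external probe
        result = bool(check_fn())
        self._entries[name] = (now, result)
        return result

    def invalidate(self) -> None:
        """Explicit flush, analogous in spirit to invalidate_check_fn_cache()."""
        self._entries.clear()
```

A test fixture that calls `invalidate()` before each test avoids the cross-test pollution mentioned under "Verified".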
Validate configured providers against both Hermes runtime provider ids and catalog-normalized provider ids. This keeps providers like ai-gateway from being rejected after catalog resolution maps them to models.dev ids. Keep credential checks and vendor-slug warnings anchored to the runtime id so doctor reports actionable provider names in follow-up diagnostics.
…ch#17202) Replace the removed built-in boot-md hook (NousResearch#17093) with a how-to that shows users how to wire up the same behavior themselves via the hooks system. Uses _resolve_gateway_model() + _resolve_runtime_agent_kwargs() so the example works against custom endpoints and OAuth providers, not just the aggregator defaults that the old built-in silently assumed. Co-authored-by: teknium1 <teknium@users.noreply.github.com>
…17203) Co-authored-by: teknium1 <teknium@users.noreply.github.com>
Address two Copilot review comments on PR NousResearch#17175.

- `wrapForFrac` doc said "additive operators or whitespace" but the implementation also matches `*` and `/`. The wider behaviour is the one we want (nested products and fractions need parens to disambiguate inline `/`), so the doc is updated to match instead of tightening the regex.
- `fenceOpenAt` was flagged as "overly conservative" vs. `markdown.tsx`, which falls back to paragraph rendering for unclosed `$$` openers. Mirroring that fallback in the streaming chunker would prematurely commit a paragraph rendering of the unclosed opener to the monotonic stable prefix, where it would be frozen and become wrong the moment the closer streams in. The asymmetry is deliberate; document why so it isn't "fixed" again later.

Made-with: Cursor
…ousResearch#17206)

detect_dangerous_command() and detect_hardline_command() were calling re.search(pattern, text, re.IGNORECASE | re.DOTALL) inline. Python's re._cache (512 patterns) amortizes compile cost on the warm path, but:

1. The first terminal() call per process pays the full compile fan-out for all 59 patterns (12 HARDLINE + 47 DANGEROUS). Measured at ~2.6 ms per detect_dangerous_command() call after re.purge().
2. The re._cache is LRU: unrelated regex work elsewhere in the agent (response parsing, text normalization, etc.) can evict our patterns and silently re-compile them on the next terminal() call.

Precompiling at module load eliminates both costs:

  detect_dangerous_command:
    cold 2.613 ms → 0.298 ms (-88%)
    warm 0.042 ms → 0.004 ms (-90%)
  detect_hardline_command:
    cold ~0.6 ms → 0.006 ms
    warm 0.011 ms → 0.002 ms

Savings are per terminal() call. Agents with heavy terminal use see compound savings; the bigger value is the stability guarantee (no re._cache eviction can silently re-introduce the 2.6 ms cold cost mid-session).

Implementation:
- HARDLINE_PATTERNS_COMPILED and DANGEROUS_PATTERNS_COMPILED built at module load from the existing (pattern, description) tuples, using shared _RE_FLAGS = re.IGNORECASE | re.DOTALL.
- detect_* functions now iterate the compiled list and call pattern_re.search(text).
- Original HARDLINE_PATTERNS and DANGEROUS_PATTERNS lists kept as-is (other code in the file uses them for key derivation / _PATTERN_KEY_ALIASES).

Verified:
- 160/161 tests/tools/test_approval*.py pass (1 pre-existing heartbeat test flake on main).
- 349/349 tests/tools/ 'approval or terminal or dangerous' pass.
- Live hermes chat smoke: 3 benign terminal commands + 1 rm -rf /tmp/ (clarify prompt fired, approval path still works) + 1 sudo (sudo password prompt fired, DANGEROUS pattern match still works). 23 log lines in the smoke window, zero errors.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
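The precompile-at-module-load pattern can be sketched as follows. The two patterns and the return type are illustrative assumptions; the real lists hold 12 + 47 entries and the real functions may return something richer than a description string.

```python
import re
from typing import List, Optional, Pattern, Tuple

_RE_FLAGS = re.IGNORECASE | re.DOTALL

# Illustrative entries only, not the project's actual pattern list.
DANGEROUS_PATTERNS: List[Tuple[str, str]] = [
    (r"\brm\s+-rf\b", "recursive delete"),
    (r"\bsudo\b", "privilege escalation"),
]

# Compiled once at import time, so neither the first call nor an
# re._cache eviction pays the compile cost later.
DANGEROUS_PATTERNS_COMPILED: List[Tuple[Pattern[str], str]] = [
    (re.compile(pattern, _RE_FLAGS), description)
    for pattern, description in DANGEROUS_PATTERNS
]


def detect_dangerous_command(text: str) -> Optional[str]:
    """Return the description of the first matching pattern, or None."""
    for pattern_re, description in DANGEROUS_PATTERNS_COMPILED:
        if pattern_re.search(text):
            return description
    return None
```

Keeping the raw `(pattern, description)` tuples alongside the compiled list mirrors the commit's choice to leave the original lists in place for other callers.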
…message directly

When the stream finishes in draft mode (got_done + _draft_mode), the consumer was sending a final draft update via _send_or_edit() before the real sendMessage. This caused the Telegram client to animate the draft content first, then switch to the real message, resulting in a visible delay where the draft appears to linger after the real message arrives.

Fix:
1. Add a branch to skip the redundant final draft update before the got_done handler runs.
2. Change the got_done + draft_mode block to actually send the real message via _send_new_chunk() and mark _final_response_sent = True so the base gateway path doesn't attempt a duplicate send.

Test updates:
- test_draft_transport_used_for_dm: final_response_sent is now True (consumer sends real message directly instead of deferring to gateway)
- test_draft_failure_falls_back_to_edit: same, real message is always sent on stream end to dismiss the draft
…onflict
- test_draft_transport_used_for_dm: fix assertion on sent messages,
add proper counting of real messages (only 1 on stream end)
- test_draft_failure_skips_update_stays_in_draft_mode: verify
consumer stays in draft mode on failure, add sent count assertion
- Remove test_draft_no_intermediate_real_messages: it assumed the
first draft would be 'ab' but _MIN_NEW_MSG_CHARS=4 causes the
consumer to skip drafts shorter than 4 chars on first send
- Replace batched (push-all-then-run) test pattern with _run_streaming
helper that progressively feeds deltas during run(), matching real
streaming behavior where tokens arrive incrementally
- Fix StubAdapter.send_draft_message to strip cursor like the real
Telegram adapter (endswith(' ▉') → strip last 2 chars)
Made-with: Cursor # Conflicts: # ui-tui/src/components/markdown.tsx
When streaming ends (got_done) in draft mode, send one final draft update with the COMPLETE accumulated text but WITHOUT cursor before sending the real message.

Previously we skipped the final draft entirely, which caused a 4-5 second delay where the client's typing animation from the previous draft cursor was still in progress when the real message arrived: the client couldn't cleanly interrupt the animation and the last few characters appeared 'stuck'.

By sending the full text without cursor first:
1. Client receives complete content with no typing animation
2. Client renders the full text instantly in the draft box
3. Subsequent real message dismisses the draft cleanly without race
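The stream-end ordering described here can be sketched with a recording stub. Everything below is hypothetical scaffolding: the adapter method signatures, `finish_draft_stream`, and `RecordingAdapter` are assumptions, and only the `' ▉'` cursor suffix comes from the test notes earlier in this PR.

```python
from typing import List, Tuple

CURSOR = " ▉"  # streaming cursor suffix appended to intermediate drafts


def strip_cursor(text: str) -> str:
    """Drop the trailing cursor, if present."""
    return text[: -len(CURSOR)] if text.endswith(CURSOR) else text


class RecordingAdapter:
    """Stub adapter that records calls instead of talking to Telegram."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def send_draft_message(self, chat_id: int, draft_id: str, text: str) -> None:
        self.calls.append(("draft", chat_id, draft_id, text))

    def send(self, chat_id: int, text: str) -> None:
        self.calls.append(("send", chat_id, text))


def finish_draft_stream(adapter: RecordingAdapter, chat_id: int, draft_id: str, accumulated: str) -> str:
    """On stream end: final cursor-free draft first, then the real message."""
    final_text = strip_cursor(accumulated)
    adapter.send_draft_message(chat_id, draft_id, final_text)  # full text, no cursor
    adapter.send(chat_id, final_text)  # real message dismisses the draft
    return final_text
```

The sketch sends plain text on both calls; in the PR the real message additionally goes through the Markdown formatting path.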
Prevent unterminated bracketed paste input from swallowing future keystrokes, and avoid rendering an empty Thinking panel before reasoning arrives.
Keep the latest prompt sticky while the viewport is in live assistant output beyond history, and clear stale sticky state at the real bottom using fresh scroll height.
Keep the /steer acknowledgement plain text so it reads like the rest of the TUI status copy.
Run the TUI lint autofix and formatter on the PR branch after the sticky prompt and paste recovery changes.
Match the buffered-stdin rearm cadence to IN_PASTE state so large pastes do not spin the normal escape timeout while waiting for readable data to drain.
…watchdog fix(tui): stabilize sticky prompts and paste recovery
Add MiniMax OAuth (minimax-oauth) as a first-class provider using a
PKCE device-code flow ported from openclaw/extensions/minimax/oauth.ts.
Changes:
- hermes_cli/auth.py:
- Add 8 MINIMAX_OAUTH_* constants (client ID, scope, grant type,
global/CN base URLs, inference URLs, refresh skew)
- Add 'minimax-oauth' ProviderConfig to PROVIDER_REGISTRY (auth_type
oauth_minimax) with global portal + inference base URLs and CN
extras in the extra dict
- Add provider aliases: minimax-portal, minimax-global, minimax_oauth
- Implement _minimax_pkce_pair(), _minimax_request_user_code(),
_minimax_poll_token(), _minimax_save_auth_state(),
_minimax_oauth_login(), _refresh_minimax_oauth_state(),
resolve_minimax_oauth_runtime_credentials(),
get_minimax_oauth_auth_status(), _login_minimax_oauth()
- Token refresh uses standard OAuth2 refresh_token grant; triggers
relogin_required on invalid_grant / refresh_token_reused
- hermes_cli/runtime_provider.py:
- Add minimax-oauth branch (after qwen-oauth) that calls
resolve_minimax_oauth_runtime_credentials() and returns
api_mode='anthropic_messages' with the OAuth Bearer token
- hermes_cli/auth_commands.py:
- Add 'minimax-oauth' to _OAUTH_CAPABLE_PROVIDERS
- Add auth_type auto-detection for oauth_minimax
- Add provider == 'minimax-oauth' branch in auth_add_command
- hermes_cli/doctor.py:
- Import get_minimax_oauth_auth_status
- Add MiniMax OAuth status check in the Auth Providers section
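For readers unfamiliar with PKCE, a verifier/challenge pair using the S256 method (RFC 7636) can be generated with the standard library alone. This is a generic sketch, not the body of `_minimax_pkce_pair()`; the function name and the 32-byte entropy choice are assumptions.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    # 32 random bytes become a 43-character base64url verifier (padding stripped).
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    # The challenge is the base64url-encoded SHA-256 digest of the verifier.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The client sends the challenge when requesting the user code and the verifier when exchanging for a token, so an intercepted code is useless without the verifier.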
Wire MiniMax-M2.7 and MiniMax-M2.7-highspeed into the model catalog,
CLI model picker, and agent auxiliary/metadata subsystems.
Changes:
- hermes_cli/models.py:
- Add 'minimax-oauth' to _PROVIDER_MODELS with MiniMax-M2.7 and
MiniMax-M2.7-highspeed
- Add ProviderEntry('minimax-oauth', 'MiniMax (OAuth)', ...) to
CANONICAL_PROVIDERS near existing minimax entries
- Add aliases: minimax-portal, minimax-global, minimax_oauth in
_PROVIDER_ALIASES
- hermes_cli/main.py:
- Add 'minimax-oauth' to provider_labels dict
- Insert 'minimax-oauth' into providers list in
select_provider_and_model() near the other minimax entries
- Add 'minimax-oauth' to --provider argparse choices
- Add _model_flow_minimax_oauth() function: ensures login via
_login_minimax_oauth(), resolves runtime credentials, prompts for
model selection, saves model choice and config
- Add dispatch elif branch for selected_provider == 'minimax-oauth'
- agent/auxiliary_client.py:
- Add 'minimax-oauth': 'MiniMax-M2.7-highspeed' to
_API_KEY_PROVIDER_AUX_MODELS
- Add 'minimax-oauth' to _ANTHROPIC_COMPAT_PROVIDERS set
- agent/model_metadata.py:
- Add 'minimax-oauth' to _PROVIDER_PREFIXES frozenset
- MiniMax-M2.7 context length (200_000) already covered by the
existing 'minimax' substring match in DEFAULT_CONTEXT_LENGTHS
Add and extend tests for the minimax-oauth provider across three test
modules.
New file: tests/test_minimax_oauth.py (15 tests)
- test_pkce_pair_produces_valid_s256: verifies PKCE verifier/challenge
pair produces a valid S256 hash and correct lengths
- test_request_user_code_happy_path: mocks httpx, verifies correct
POST parameters and response parsing
- test_request_user_code_state_mismatch_raises: verifies CSRF guard
- test_request_user_code_non_200_raises: verifies HTTP error handling
- test_poll_token_pending_then_success: verifies polling loop retries
on 'pending' and returns on 'success'
- test_poll_token_error_raises: verifies 'error' status raises AuthError
- test_poll_token_timeout_raises: verifies deadline expiry raises
- test_refresh_skip_when_not_expired: verifies no HTTP call when token
is fresh
- test_refresh_updates_access_token: verifies new access/refresh tokens
stored on successful refresh
- test_refresh_reuse_triggers_relogin_required: verifies
relogin_required=True on invalid_grant/refresh_token_reused
- test_resolve_credentials_requires_login: verifies AuthError when no
stored state
- test_provider_registry_contains_minimax_oauth: PROVIDER_REGISTRY key
- test_minimax_oauth_alias_resolves: portal/global/underscore aliases
- test_get_minimax_oauth_auth_status_not_logged_in
- test_get_minimax_oauth_auth_status_logged_in
Extended: tests/hermes_cli/test_runtime_provider_resolution.py
- test_minimax_oauth_runtime_returns_anthropic_messages_mode
- test_minimax_oauth_runtime_uses_inference_base_url
Extended: tests/hermes_cli/test_api_key_providers.py
- TestMinimaxOAuthProvider class (8 tests) covering registry keys,
auth_type, endpoints, client_id, aliases, CANONICAL_PROVIDERS
listing, _PROVIDER_MODELS entries, and aux model
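The polling behaviour these tests pin down (retry on 'pending', return on 'success', raise on 'error' or deadline expiry) might be sketched like this. The signature, the payload shape, and the local `AuthError` class are assumptions; unknown statuses are treated as pending for simplicity.

```python
import time
from typing import Any, Callable, Dict


class AuthError(Exception):
    """Stand-in for the project's auth error type."""


def poll_token(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the token endpoint reports success, an error, or the deadline passes."""
    deadline = clock() + timeout
    while clock() < deadline:
        payload = fetch_status()
        status = payload.get("status")
        if status == "success":
            return payload
        if status == "error":
            raise AuthError("authorization failed")
        sleep(interval)  # 'pending' (or anything else): wait and retry
    raise AuthError("timed out waiting for authorization")
```

Injecting `clock` and `sleep` keeps the timeout test instant, which is presumably why the real tests mock the HTTP layer rather than waiting.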
Add comprehensive documentation for the minimax-oauth provider.
New file: website/docs/guides/minimax-oauth.md
- Overview table (provider ID, auth type, models, endpoints)
- Quick start via 'hermes model'
- Manual login via 'hermes auth add minimax-oauth'
- --region global|cn flag reference
- The PKCE OAuth flow explained step-by-step
- hermes doctor output example
- Configuration reference (config.yaml shape, region table, aliases)
- Environment variables note: MINIMAX_API_KEY is NOT used by
minimax-oauth (OAuth path uses browser login)
- Models table with context length note
- Troubleshooting section: expired token, timeout, state mismatch,
headless/remote sessions, not logged in
- Logout command
Updated: website/docs/getting-started/quickstart.md
- Add MiniMax (OAuth) to provider picker table as the recommended
path for users who want MiniMax models without an API key
Updated: website/docs/user-guide/configuration.md
- Add 'minimax-oauth' to the auxiliary providers list
- Add MiniMax OAuth tip callout in the providers section
- Add minimax-oauth row to the provider table (auxiliary tasks)
- Add MiniMax OAuth config.yaml example in Common Setups
Updated: website/docs/reference/environment-variables.md
- Annotate MINIMAX_API_KEY, MINIMAX_BASE_URL, MINIMAX_CN_API_KEY,
MINIMAX_CN_BASE_URL as NOT used by minimax-oauth
- Add minimax-oauth to HERMES_INFERENCE_PROVIDER allowed values
Close integration gaps discovered by auditing qwen-oauth's file coverage. These are surfaces the original salvage missed; they all existed on main and were added in the 747 commits since PR NousResearch#15203 was opened.

Coverage added:
- agent/credential_pool.py: seed pool from auth.json providers.minimax-oauth so `hermes auth list` reflects logged-in state and `hermes auth remove minimax-oauth <N>` works through the standard flow.
- agent/credential_sources.py: register RemovalStep for minimax-oauth with suppression-aware `_clear_auth_store_provider`.
- agent/models_dev.py: PROVIDER_TO_MODELS_DEV mapping (-> 'minimax' family).
- hermes_cli/providers.py: HermesOverlay entry (anthropic_messages transport, oauth_external auth_type, api.minimax.io/anthropic base).
- hermes_cli/model_normalize.py: add to _MATCHING_PREFIX_STRIP_PROVIDERS so `minimax-oauth/MiniMax-M2.7` in config.yaml gets correctly repaired.
- hermes_cli/status.py: render MiniMax OAuth block in `hermes doctor` (logged-in / region / expires_at / error).
- hermes_cli/web_server.py: register in OAUTH_PROVIDER_REGISTRY + dispatch branch in _resolve_provider_status so the dashboard auth page shows it.
- website/docs/integrations/providers.md: full 'MiniMax (OAuth)' section.
- website/docs/reference/cli-commands.md: --provider enum.
- website/docs/user-guide/features/fallback-providers.md: fallback table row.
- scripts/release.py AUTHOR_MAP: amanning3390 mapping (CI gate).
feat(latex): latex in tui
…ousResearch#17563)

Skill catalog pages (bundled/optional) were drowning out real user-guide and reference docs in search results. There are ~3100 of them and they match on almost every generic term.

- Add `ignoreFiles` regexes to docusaurus-search-local for `user-guide/skills/bundled/` and `user-guide/skills/optional/`. The two human-written catalog indexes (`reference/skills-catalog`, `reference/optional-skills-catalog`) remain indexed.
- Add a new feature page `user-guide/features/curator.md` covering the curator subsystem merged in NousResearch#16049 and refined in NousResearch#17307 (per-run reports): how it runs, config, CLI (`hermes curator status/run/pin/restore/...`), `.usage.json` telemetry, archival semantics, and recovery. Slotted into the Core features sidebar next to Skills.

Search index size dropped from 5822 docs to 2704 in the main section; `user-guide/features/curator` is indexed.
…ch#17562) Extend curator's pin flag from 'skip auto-transitions' to 'no agent edits at all'. All five skill_manage mutation actions (edit, patch, delete, write_file, remove_file) now refuse pinned skills with a message pointing the user at `hermes curator unpin <name>`. Motivation: pin used to only stop the curator's own maintenance pass from touching a skill. Nothing prevented the main agent from editing or deleting a pinned skill via skill_manage in-session. This gives users a hard fence against unwanted agent edits — same semantics as curator pinning, extended to the write tool. Create is unaffected (you can't pin a name that doesn't exist yet, and name collisions already error out). Broken sidecars fail open rather than lock the agent out. The schema description advertises the new refusal so models know not to route around it with rename/recreate tricks.
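The pin guard could be sketched as a small pre-check in front of the mutation actions. The function name, the `pinned` set argument, and the exact refusal wording are assumptions; only the five action names and the `hermes curator unpin <name>` pointer come from the description above.

```python
from typing import Optional, Set

# The five skill_manage actions that mutate an existing skill.
MUTATING_ACTIONS = frozenset({"edit", "patch", "delete", "write_file", "remove_file"})


def pin_refusal(action: str, skill_name: str, pinned: Set[str]) -> Optional[str]:
    """Return a refusal message if the action must be blocked, else None."""
    if action in MUTATING_ACTIONS and skill_name in pinned:
        return (
            f"Skill '{skill_name}' is pinned and cannot be modified. "
            f"Run `hermes curator unpin {skill_name}` to allow edits."
        )
    return None  # 'create' and unpinned skills pass through
```

If reading the pin state fails, the commit says to fail open, so a caller would pass an empty `pinned` set rather than block the agent.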
…ousResearch#17578) Add a dedicated 'Pinning a skill' section that covers both gating layers — curator auto-transitions AND the agent's skill_manage tool — so users know what the flag actually protects against after PR NousResearch#17562. Updates the one-line claim in 'How it runs' to cross-link the new section instead of only mentioning auto-transitions.
Cron is a built-in Hermes feature (CLI `hermes cron`, `cronjob` agent tool, gateway ticker, scheduler in cron/scheduler.py) but croniter has been gated behind the [cron] optional extra. Users who do a plain `pip install hermes-agent` can create jobs via /cron but any recurring cron schedule silently returns next_run_at=None (HAS_CRONITER=False), which then gets wrapped into a 'state=error' message only after a tick. Move croniter into core dependencies so scheduled jobs work out of the box on any install path. The [cron] extra is kept as an empty passthrough so existing `pip install hermes-agent[cron]` installs and the [all]/[termux] extras continue to resolve. Also update the now-stale user-facing error message in `compute_next_run()` that still tells users to install `hermes-agent[cron]`. Salvaged from NousResearch#17234 (authored by @txbxxx) with a corrected premise: the original PR claimed [cron] wasn't in [all], but it is (pyproject.toml line 112). The real UX problem is the plain no-extras install path, which this fix addresses.
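The failure mode described above comes from an optional-import guard. A minimal sketch of that shape, assuming a simplified `compute_next_run()` signature (the real one in cron/scheduler.py may differ); the `croniter(expr, base).get_next(datetime)` call is the library's documented usage.

```python
from datetime import datetime
from typing import Optional

try:
    from croniter import croniter
    HAS_CRONITER = True
except ImportError:  # the plain no-extras install path before this change
    croniter = None
    HAS_CRONITER = False


def compute_next_run(expr: str, base: datetime) -> Optional[datetime]:
    """Return the next fire time after base, or None when croniter is missing."""
    if not HAS_CRONITER:
        return None  # previously surfaced to users only later, as state=error
    return croniter(expr, base).get_next(datetime)
```

With croniter in core dependencies the `None` branch becomes unreachable on a normal install, which is the point of the change.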
Capture the reusable layout and animation lessons from the advanced Pretext demo so the skill teaches measured obstacle fields, morphing geometry, and polished browser examples.
…y dependency

Complete rewrite of the ComfyUI skill to use:
- comfy-cli (official, Comfy-Org/comfy-cli) for lifecycle management: install, launch, stop, node management, model downloads
- Direct REST API + helper scripts for workflow execution: parameter injection, submission, monitoring, output download
- No dependency on comfyui-skill-cli or any unofficial tool

New files:
- SKILL.md: full rewrite with two-layer architecture, decision tree, pitfalls
- references/official-cli.md: complete comfy-cli command reference
- references/rest-api.md: all REST endpoints (local + cloud)
- references/workflow-format.md: API format spec, common nodes, param mapping
- scripts/extract_schema.py: analyze workflow → extract controllable params
- scripts/run_workflow.py: inject args, submit, poll, download outputs
- scripts/check_deps.py: check missing nodes/models against running server
- scripts/comfyui_setup.sh: full setup automation with official CLI

Removed:
- references/cli-reference.md (was for unofficial comfyui-skill-cli)
- references/api-notes.md (replaced by rest-api.md)

Addresses feedback from PR NousResearch#17316 comment:
- Correct author attribution
- Remove references to unofficial OpenClaw project
- License field reflects hermes-agent repo (MIT)
…links, cloud setup

Adds structured onboarding flow to SKILL.md:
- Decision table: which install path for which situation
- Path A: Comfy Cloud (zero setup, API key, pricing)
- Path B: Desktop app (Windows/macOS, one-click)
- Path C: Portable build (Windows, extract-and-run)
- Path D: comfy-cli (recommended for agents, all platforms)
- Path E: Manual install (advanced, all hardware types)
- Post-install: model downloads, custom nodes, verification

All paths link to official docs:
- https://docs.comfy.org/installation
- https://docs.comfy.org/comfy-cli/getting-started
- https://docs.comfy.org/get_started/cloud
- https://docs.comfy.org/installation/desktop
- https://docs.comfy.org/installation/comfyui_portable_windows
- https://docs.comfy.org/installation/manual_install
…ring sanitization' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…ring sanitization' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Layers a programmatic hardware-feasibility check on top of the v4 skill
so the agent doesn't silently push users toward a local install they
can't actually run. The official comfy-cli supports --nvidia / --amd /
--m-series / --cpu, but has no guard against "4 GB laptop GPU on SDXL"
or "Intel Mac falling back to CPU" — both route to comfy-cli paths in
the original table and then fail on first workflow.
- scripts/hardware_check.py: detect OS/arch/GPU (NVIDIA nvidia-smi,
AMD rocm-smi, Apple M1+ via arm64+sysctl, Intel Arc via clinfo),
VRAM, system/unified RAM. Emits JSON
{verdict: ok|marginal|cloud, recommended_install_path, comfy_cli_flag}
with practical thresholds: discrete GPU >=6 GB VRAM minimum,
Apple Silicon >=16 GB unified memory minimum, Intel Mac -> cloud,
no accelerator -> cloud. comfy_cli_flag maps directly to
`comfy install` so the agent can stitch the whole flow together.
- scripts/comfyui_setup.sh: runs hardware_check.py first when no
explicit flag is passed. If verdict=cloud, refuses to install
locally, prints Comfy Cloud URL + an override command, exits 2.
Otherwise auto-selects the right --nvidia/--amd/--m-series flag
for `comfy install`. Surfaces marginal-verdict notes to the user.
- SKILL.md Setup & Onboarding: adds mandatory Step 0 "Check If This
Machine Can Run ComfyUI Locally" ahead of the Path A-E selection.
Documents the verdict thresholds inline, ties verdict + comfy_cli_flag
to the install paths, and updates the path-choice table so
"verdict: cloud" is the first row. Quick-Start "Detect Environment"
block extended to include the hardware check. Verification
checklist gains a hardware-check gate.
- Frontmatter setup.help rewritten to point at hardware_check.py
first. Version bumped 4.0.0 -> 4.1.0.
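The verdict thresholds listed above can be sketched as a pure function. This is an assumption-heavy illustration, not scripts/hardware_check.py: the function name, the accelerator labels, and the output keys beyond `verdict` and `comfy_cli_flag` are hypothetical, and the `marginal` band is omitted because the commit message does not say where it starts.

```python
from typing import Dict, Optional

# Hypothetical accelerator labels mapped to the comfy-cli flags named above.
_FLAGS = {"nvidia": "--nvidia", "amd": "--amd", "apple_silicon": "--m-series"}


def hardware_verdict(
    accelerator: Optional[str],
    vram_gb: float = 0.0,
    unified_ram_gb: float = 0.0,
) -> Dict[str, Optional[str]]:
    """Map detected hardware to an ok/cloud verdict plus a comfy-cli flag."""
    if accelerator in ("nvidia", "amd"):
        ok = vram_gb >= 6  # discrete GPU minimum
    elif accelerator == "apple_silicon":
        ok = unified_ram_gb >= 16  # unified memory minimum
    else:
        ok = False  # Intel Mac or no accelerator: route to cloud
    if not ok:
        return {"verdict": "cloud", "comfy_cli_flag": None}
    return {"verdict": "ok", "comfy_cli_flag": _FLAGS[accelerator]}
```

A setup script can then refuse a local install on `verdict == "cloud"` and otherwise pass `comfy_cli_flag` straight to `comfy install`.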
…search#17612) Adds Step 0 'Ask Local vs Cloud' as the very first onboarding step, with a scripted question that spells out the hardware requirements for local (6 GB VRAM NVIDIA, ROCm AMD on Linux, or M1+ Mac with 16 GB unified) and routes Cloud users straight to Path A without a hardware check. Hardware check becomes Step 1, run only when the user picked local.
…NousResearch#17611)

The skip_pre_tool_call_hook flag was added to prevent double-firing of pre_tool_call when run_agent._invoke_tool pre-checks for a block directive and then dispatches via handle_function_call. But the implementation added an else: branch that fired invoke_hook again for 'observers', without noticing that get_pre_tool_call_block_message() in hermes_cli.plugins already fires invoke_hook('pre_tool_call', ...) as part of its block-directive poll.

Result: every tool call run through the run_agent loop fired the hook twice, reported by community users whose observer / audit plugins logged each tool invocation twice with identical timestamps.

Fix: delete the else: branch. The single-fire contract is now:
- skip=False (direct handle_function_call): hook fires once inside get_pre_tool_call_block_message().
- skip=True (run_agent._invoke_tool path): caller fires the hook once via get_pre_tool_call_block_message(); handle_function_call must not fire it again.

Tightened the existing skip-flag test (renamed to test_skip_flag_prevents_double_fire) to assert pre_tool_call fires zero times when skip=True, and added test_run_agent_pattern_fires_pre_tool_call_exactly_once to lock in end-to-end that the full block-check + dispatch sequence fires the hook exactly once.
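The single-fire contract can be modelled with a toy hook recorder. The function names echo the ones in the commit message, but the signatures, the `Hooks` class, and the return values are simplified assumptions for illustration.

```python
from typing import List, Optional, Tuple


class Hooks:
    """Toy plugin host that records every hook invocation."""

    def __init__(self) -> None:
        self.fired: List[Tuple[str, str]] = []

    def invoke_hook(self, name: str, tool: str) -> Optional[str]:
        self.fired.append((name, tool))
        return None  # no plugin blocks the call in this sketch


def get_pre_tool_call_block_message(hooks: Hooks, tool: str) -> Optional[str]:
    # Polling for a block directive is itself what fires pre_tool_call.
    return hooks.invoke_hook("pre_tool_call", tool)


def handle_function_call(hooks: Hooks, tool: str, skip_pre_tool_call_hook: bool = False) -> str:
    if not skip_pre_tool_call_hook:
        blocked = get_pre_tool_call_block_message(hooks, tool)
        if blocked:
            return blocked
    # No else branch: when skip is set, the caller has already fired the hook.
    return f"ran {tool}"


def invoke_tool(hooks: Hooks, tool: str) -> str:
    """The run_agent-style path: pre-check once, then dispatch with skip=True."""
    blocked = get_pre_tool_call_block_message(hooks, tool)
    if blocked:
        return blocked
    return handle_function_call(hooks, tool, skip_pre_tool_call_hook=True)
```

Both entry points end with exactly one recorded `pre_tool_call`, which is what the two tests named above assert.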
skills: add pretext (creative demos with @chenglou/pretext)
) Intended placement per PR NousResearch#17610 discussion — comfyui belongs in skills/creative/ alongside other creative built-ins (touchdesigner-mcp, pretext, sketch), not in optional-skills/. Pure directory rename, no content changes. History preserved via git mv.
Resolve conflicts:
- gateway/run.py: keep both _stream_metadata (ours) + on_new_message (upstream)
- gateway/stream_consumer.py: keep both _resolve_draft_mode (ours) + _notify_new_message (upstream)
Summary
Adds `sendMessageDraft` transport for Telegram DM streaming, with a critical fix for draft-vanish on tool calls and fast draft dismissal on stream end.
Motivation
Current streaming uses `edit_message_text`, which has flood-control limits and causes visible flicker. Telegram Bot API 9.3+ exposes a native draft API that renders progressively without rate-limit pressure.
Key advantages
1. Draft-vanish fix on tool calls
When the LLM calls a tool mid-stream (e.g. web_search → answer), the ephemeral draft vanishes on the Telegram client because any real message posted by the bot dismisses it. This causes a jarring UX: partial text appears, disappears when the tool runs, then reappears.
This PR handles this at the segment-break boundary:
Without this fix, DM users see text disappear on every tool call — a common pattern for search-and-answer workflows.
2. No "draft lingers" delay on stream end
When the stream finishes, the consumer now:
- Skips the redundant final `sendMessageDraft` update (avoids forcing the client to animate draft content right before the real message)
- Sends the real message directly in the `got_done` block instead of deferring to the gateway path
This eliminates the visible delay where the draft appears to "hang" after the real message arrives.
Changes
`gateway/platforms/telegram.py` (+53 lines)
- `send_draft_message()` adapter method
`gateway/stream_consumer.py` (+128/-8 lines)
- `transport` config field on `StreamConsumerConfig` (`"auto"`, `"draft"`, `"edit"`, `"off"`)
- `_resolve_draft_mode()` method: resolves transport from config + adapter capability + chat_type
- `_send_or_edit_draft()` method as draft transport path
- `__draft__` sentinel for `_message_id` (draft has no real message_id)
- `got_done` handling: skips redundant final draft update, sends real message directly
- `__no_edit__` guards extended to include `__draft__`
`gateway/run.py` (+20 lines)
- `transport="auto"` for Telegram private chats
- `chat_type` as metadata so the consumer can resolve transport mode
Behavior
Test plan
Tests cover: