
feat(skills): add comfyui-skill-openclaw — CLI-based ComfyUI image generation#13271

Open
HuangYuChuh wants to merge 2 commits into
NousResearch:mainfrom
HuangYuChuh:feat/add-comfyui-skill-openclaw

Conversation

@HuangYuChuh

Contributor

Summary

Adds a new optional skill under creative/ that enables Hermes to orchestrate ComfyUI workflows for AI image generation via a single CLI tool.

What it does:

  • Run any ComfyUI workflow (Stable Diffusion, Flux, SDXL, etc.) from the terminal
  • Multi-server support — route jobs to different GPU machines
  • Dependency management — auto-detect and install missing custom nodes
  • Workflow import — bring in workflows from ComfyUI editor or API format
  • Full lifecycle: submit → poll → retrieve outputs

Prerequisites:

pip install -U comfyui-skill-cli

Source repo: https://github.com/HuangYuChuh/ComfyUI_Skills_OpenClaw

Why optional (not bundled)

This skill requires a running ComfyUI server instance and GPU hardware, so it is not broadly applicable to all Hermes users. That makes it a good fit for optional-skills/creative/.

Checklist

  • SKILL.md follows agentskills.io frontmatter format
  • platforms declared (macOS, Linux, Windows)
  • prerequisites.commands specified
  • Category: creative
  • Tags for discoverability: image-generation, comfyui, ai-art, workflow, stable-diffusion, flux
  • Setup instructions included
  • Tested with Hermes Agent locally

Add a CLI-based skill that enables Hermes to run ComfyUI workflows
for image generation, supporting multi-server execution, dependency
management, and workflow import.

Requires: pip install -U comfyui-skill-cli
Source: https://github.com/HuangYuChuh/ComfyUI_Skills_OpenClaw
@alt-glitch added the labels type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/plugins (Plugin system and bundled plugins), and tool/skills (Skills system: list, view, manage) on Apr 22, 2026
@alt-glitch

Collaborator

Related to #11143 (ComfyUI MCP skill). This is a CLI-based alternative approach.

@alt-glitch

Collaborator

Related to #11143

@HuangYuChuh

Contributor Author

Thanks for linking #11143 @alt-glitch — took a close look at both approaches.

The core difference is where the complexity lives.

#11143 teaches the agent to hand-craft ComfyUI workflow JSON at runtime — the agent has to assemble node graphs (["4", 0] link syntax), manage polling loops, handle urllib multipart uploads, etc. It's essentially a 346-line tutorial embedded in SKILL.md. That works for simple txt2img, but real-world ComfyUI workflows have 20-80 nodes with complex routing — no LLM reliably generates those from scratch.
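
For readers unfamiliar with the `["4", 0]` link syntax mentioned above, here is a small sketch of an API-format node graph. A link is a two-element list of `[source_node_id, output_index]`. The graph is deliberately trimmed (the `KSampler` inputs are incomplete, and the checkpoint file name and prompt are placeholders), so it illustrates the shape rather than something that could be submitted as-is. `is_link` and `dangling_links` are hypothetical helpers written for this example.

```python
from typing import Any

# Trimmed API-format graph: node id -> {"class_type", "inputs"}.
workflow: dict[str, dict[str, Any]] = {
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "model.safetensors"},  # placeholder file name
    },
    "6": {
        "class_type": "CLIPTextEncode",
        # ["4", 1] means: output slot 1 (CLIP) of node "4".
        "inputs": {"text": "a lighthouse at dusk", "clip": ["4", 1]},
    },
    "3": {
        "class_type": "KSampler",
        # ["4", 0] means: output slot 0 (MODEL) of node "4".
        "inputs": {"model": ["4", 0], "positive": ["6", 0], "seed": 42, "steps": 20},
    },
}


def is_link(value: Any) -> bool:
    """True if value looks like a [node_id, output_index] link."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
    )


def dangling_links(wf: dict[str, dict[str, Any]]) -> list[tuple[str, str, str]]:
    """Return (node_id, input_name, missing_source_id) for links to absent nodes."""
    missing = []
    for node_id, node in wf.items():
        for name, value in node["inputs"].items():
            if is_link(value) and value[0] not in wf:
                missing.append((node_id, name, value[0]))
    return missing


print(dangling_links(workflow))
```

Even in this three-node fragment every connection has to be spelled out by id and slot index, which gives a sense of how error-prone hand-assembly becomes at 20-80 nodes.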

This PR takes the opposite approach: the user's existing workflows are the source of truth. The CLI handles import, schema extraction, parameter injection, dependency resolution, multi-server routing, and execution history. The agent's job is reduced to:

comfyui-skill list                           # what's available?
comfyui-skill run flux-portrait --args '{…}' # execute
comfyui-skill history list flux-portrait     # what happened?
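
From the agent side, the `run` call above might be assembled along these lines. This is a sketch under assumptions: `build_run_command` is a hypothetical helper, and the `prompt` key is an invented parameter name, since the real parameter names come from each imported workflow's schema. Only the `run` subcommand and the `--args` flag are taken from the commands shown above.

```python
import json
import shlex


def build_run_command(workflow: str, args: dict) -> list[str]:
    """Build the argv for `comfyui-skill run <workflow> --args '<json>'`."""
    return ["comfyui-skill", "run", workflow, "--args", json.dumps(args)]


# "prompt" is a hypothetical parameter name used only for illustration.
cmd = build_run_command("flux-portrait", {"prompt": "studio portrait, soft light"})

# shlex.join renders the argv as a copy-pasteable shell line for logging.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` as a list avoids shell quoting problems with the JSON payload.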

A few concrete things this handles that #11143 doesn't:

  • Workflow discovery: list shows all imported workflows with param counts — the agent never sees raw node graphs
  • Dependency checking: deps check tells you which custom nodes or models are missing before you hit a cryptic class_type not found
  • Idempotency: --job-id prevents duplicate GPU burns on agent retries (O(1) file-existence check)
  • Multi-server: remote-a100/sdxl-base routes to a specific GPU box — no env var juggling
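
The idempotency bullet describes a file-existence check keyed on the job id. One way such a check could work is sketched below. This is an assumption-laden illustration, not the CLI's code: `run_once`, the `.result` marker file, and the state directory layout are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_once(job_id: str, state_dir: Path, work: Callable[[], str]) -> str:
    """Run `work` at most once per job_id; later calls return the stored result."""
    # Assumes job_id is safe to use as a file name.
    marker = state_dir / f"{job_id}.result"
    if marker.exists():  # a single path lookup, no scan of past jobs
        return marker.read_text()
    result = work()
    marker.write_text(result)
    return result


calls = []


def expensive_render() -> str:
    """Stand-in for a GPU job; records each invocation."""
    calls.append(1)
    return "image-001.png"


with tempfile.TemporaryDirectory() as tmp:
    first = run_once("job-42", Path(tmp), expensive_render)
    second = run_once("job-42", Path(tmp), expensive_render)  # simulated agent retry

print(first, second, len(calls))
```

The marker is written only after `work` returns, so a crashed run is retried rather than treated as complete. Concurrent retries would need a lock or atomic file creation, which this sketch omits.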

Also worth noting: #11143 triggered the supply-chain audit 3 times due to urllib.request.urlopen POST calls in SKILL.md. This PR has zero alerts — all network I/O lives in the installed CLI package, not in the skill definition.

Beyond the CLI, this project also ships a dedicated Web UI for visual workflow management — import, parameter editing, execution monitoring, and history browsing — making it accessible to non-technical users as well.

These two approaches serve different users. #11143 is great for someone who wants to learn the ComfyUI API. This one is for shipping — the agent calls a tool, gets an image back.

Happy to add a related_skills cross-reference if both land.

- SKILL.md: rewritten to match Hermes optional-skill standard (frontmatter,
  sections, uvx zero-install invocation, Web UI section, 10 pitfalls,
  verification checklist)
- references/cli-reference.md: complete 27-command reference
- references/api-notes.md: REST API endpoint map for debugging
- scripts/comfyui_setup.sh: workspace initialization with CLI auto-detection

Co-Authored-By: HuangYuChuh <HuangYuChuh@users.noreply.github.com>
@HuangYuChuh

Contributor Author

Updated this PR to fully align with Hermes optional-skill conventions. The skill now includes:

  • SKILL.md rewritten to match the standard structure (frontmatter, When to Use, How It Works, Core Workflow, Pitfalls, Verification Checklist)
  • references/cli-reference.md — complete 27-command reference verified against the actual CLI help output
  • references/api-notes.md — REST API endpoint map for debugging
  • scripts/comfyui_setup.sh — workspace initialization with auto-detection of uvx vs pip-installed CLI

Noticed #17316 was opened with a similar scope. A few things worth noting:

  • This PR is maintained by the comfyui-skill-cli author — the command docs here are verified against the actual codebase, not reconstructed from README/help text
  • Correct attribution: author: HuangYuChuh, license: Apache-2.0 (the CLI is Apache-2.0, not MIT)
  • Includes a Web UI section for visual workflow management (import, parameter editing, execution history, i18n), which is not covered in #17316
  • Supports both uvx (zero-install) and direct comfyui-skill (pip/pipx) invocation
  • Setup script auto-detects available CLI and accepts --url for custom server addresses
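
The auto-detection in the last two bullets could look roughly like this. The real setup script is a shell script whose contents are not shown here, so this Python sketch rests on assumptions: `detect_cli` is a hypothetical name, preferring an installed `comfyui-skill` over `uvx` is a guess at the priority order, and the `uvx --from comfyui-skill-cli comfyui-skill` form is inferred from the package and command names in this PR.

```python
import shutil
from typing import Callable, Optional


def detect_cli(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the argv prefix for invoking the CLI, based on what is on PATH."""
    if which("comfyui-skill"):
        # Installed via pip or pipx: call the command directly.
        return ["comfyui-skill"]
    if which("uvx"):
        # Zero-install path: let uvx fetch the package and run its command.
        return ["uvx", "--from", "comfyui-skill-cli", "comfyui-skill"]
    raise RuntimeError("neither comfyui-skill nor uvx was found on PATH")


# Simulate a machine that only has uvx installed.
def only_uvx(name: str) -> Optional[str]:
    return "/usr/bin/uvx" if name == "uvx" else None


prefix = detect_cli(only_uvx)
print(prefix + ["list"])
```

Injecting `which` keeps the detection logic checkable without depending on what is installed on the machine running it.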

Happy to iterate on any feedback from maintainers.


2 participants