fix(skills/comfyui): bug fixes, cloud parity, expanded coverage, examples, tests#17682
Closed
SHL0MS wants to merge 2 commits into
Closed
fix(skills/comfyui): bug fixes, cloud parity, expanded coverage, examples, tests#17682SHL0MS wants to merge 2 commits into
SHL0MS wants to merge 2 commits into
Conversation
…ples, tests
The audit of v4.1 surfaced ~70 issues across the five scripts and three
reference docs — most user-visible (silent file overwrites, status-error
misclassified as success, X-API-Key leaked to S3 on /api/view redirect,
Cloud endpoints that 404 because they were renamed). v5.0.0 fixes those
and fills the gaps that previously forced users to write their own glue
(WebSocket monitoring, batch/sweep, img2img upload helper, dep auto-fix,
log fetch, health check, example workflows).
Critical fixes
- run_workflow.py: poll_status now checks status_str==error BEFORE
completed:true, so a failed run no longer reports success
- run_workflow.py: download_output streams to disk via safe_path_join,
preserves server subfolder structure (no silent overwrites), and
retries with exponential backoff
- run_workflow.py: refuses to overwrite a link with a literal in
inject_params (would silently break wiring)
- _common.py: _StripSensitiveOnRedirectSession (subclasses
requests.Session.rebuild_auth) drops X-API-Key/Cookie on cross-host
redirects — fixes a real key-leak path through Cloud's signed-URL
download flow. Tested
- Cloud routing (verified live): /history → /history_v2,
/models/<f> → /experiment/models/<f>, plus folder aliases for the
unet ↔ diffusion_models and clip ↔ text_encoders rename
- check_deps.py: distinguishes 200/empty vs 404 folder_not_found vs
403 free-tier; emits concrete fix_command per missing dep
- extract_schema.py: prompt vs negative_prompt determined by tracing
KSampler.{positive,negative} connections (incl. through Reroute /
Primitive nodes) instead of meta-title heuristic; symmetric
duplicate-name resolution; cycle-safe trace_to_node
- hardware_check.py: multi-GPU pick-best, Apple variant detection,
Rosetta detection, WSL2, ROCm --json, disk-space check, optional
PyTorch probe; powershell preferred over deprecated wmic
- comfyui_setup.sh: prefers pipx → uvx → pip --user (with PEP-668
fallback); idempotent — skips relaunch if server already up;
configurable port/workspace; persistent log; SIGINT trap
New scripts
- run_batch.py — count or sweep (cartesian product), parallel up to
cloud tier limit
- ws_monitor.py — real-time WebSocket viewer; saves preview frames
- auto_fix_deps.py — runs comfy node install / model download for
whatever check_deps reports missing (with --dry-run)
- health_check.py — single command that runs the verification checklist
(comfy-cli + server + checkpoints + optional smoke test that cancels
itself to avoid burning compute)
- fetch_logs.py — pull traceback / status messages for a prompt_id
Coverage expansion
- Param patterns now cover Flux (BasicScheduler, BasicGuider,
RandomNoise, ModelSamplingFlux), SD3, Wan/Hunyuan/LTX video,
IPAdapter, rgthree, easy-use, AnimateDiff
- Embedding refs in CLIPTextEncode strings extracted as model deps
- ckpt_name / vae_name / lora_name / unet_name now controllable so
workflows can be retargeted per run
Examples
- workflows/{sd15,sdxl,flux_dev}_txt2img.json
- workflows/sdxl_{img2img,inpaint}.json
- workflows/upscale_4x.json
- workflows/{animatediff_video,wan_video_t2v}.json + README
Tests
- 117 tests (105 unit + 8 cloud integration + 4 cross-host security)
- Cloud tests auto-skip without COMFY_CLOUD_API_KEY; verified end-to-end
against live cloud API
Backwards compatibility
- All existing CLI flags continue to work; new behavior is opt-in
(--ws, --input-image, --randomize-seed, --flat-output, etc.)
Collaborator
|
@christian-byrne Tagging you on this ComfyUI item. |
1 similar comment
Collaborator
|
@christian-byrne Tagging you on this ComfyUI item. |
Self-review caught several errors in the previous commit: Frontmatter - Replace non-standard `requires_runtime` / `requires_tooling` fields with the documented `compatibility:` field (parsed by tools/skills_tool.py). - Drop the `audit-v5` author tag I added unnecessarily. MODEL_LOADERS catalog - Remove `IPAdapterUnifiedLoader` (input `preset` is an enum, not a file). - Remove `IPAdapterInsightFaceLoader` and `InsightFaceLoader` (input `provider` is a GPU backend selector, not a model file). These would have flagged enum values like "STANDARD" or "CUDA" as missing model files. - Add "NB:" comment explaining `BasicGuider` has no `cfg` input (the original PARAM_PATTERNS entry would never have matched). - Remove `SamplerCustomAdvanced.noise_seed` from PARAM_PATTERNS — that node takes a NOISE input from RandomNoise, not a seed field directly. NODE_TO_PACKAGE registry slugs - Verified all 18 packages against api.comfy.org and fixed: - `comfyui-essentials` → `comfyui_essentials` (underscore, not hyphen) - `comfyui-gguf` → `ComfyUI-GGUF` (case-sensitive) - `comfyui-photomaker-plus` → `ComfyUI-PhotoMaker-Plus` - `comfyui-wanvideowrapper` → `ComfyUI-WanVideoWrapper` - ComfyUI-HunyuanVideoWrapper isn't on the registry; surface a git-URL install hint via new NODE_TO_GIT_URL fallback so the user can install via ComfyUI-Manager's /manager/queue/install endpoint. Wrong class names - `Canny` → `CannyEdgePreprocessor` (controlnet-aux registers the latter, the former never appears in /object_info). - Add `Zoe_DepthAnythingPreprocessor` and `AnimalPosePreprocessor` while fixing controlnet-aux. - Remove `Reroute (rgthree)` (rgthree's Reroute is JS-only — no Python class, never appears in /object_info). - Add `Display Int (rgthree)` (sibling of Display Any). - Move `UltralyticsDetectorProvider` from `comfyui-impact-pack` to `comfyui-impact-subpack` (separate package, registered there). Tests - Update test_packages_are_safe_for_shell to accept case-mixed slugs (the registry uses both ComfyUI- and comfyui_ prefixes inconsistently). Replaced the lowercase-only assertion with a shell-safe regex check. - 117 tests still pass (105 unit + 8 cloud + 4 cross-host). Attribution - Add `SHL0MS@users.noreply.github.com` mapping to scripts/release.py AUTHOR_MAP so check-attribution CI passes.
Contributor
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A comprehensive audit of the v4.1 ComfyUI skill surfaced ~70 issues across the
five scripts and three reference docs. v5.0.0 fixes them and fills the gaps
that previously forced users to write their own glue around the skill.
This is a large PR — apologies. Most of the issues are coupled (a single
_common.pyfor HTTP/cloud-routing/catalogs eliminates duplication, but itripples through every script that imported them). All changes live in
skills/creative/comfyui/; nothing else is touched.Every fix is verified end-to-end against the live Comfy Cloud API (via a
free-tier key) — the cloud endpoint differences turned out to be the biggest
hidden bug class in the original.
Why now
The v4.1 skill works for happy-path local SD1.5 / SDXL submission, but breaks
in real production scenarios:
directly to
output_dir / filename. Two outputs with the samefilenameindifferent
subfolders collided. A malicious workflow could../../etc/....entry["status"]["completed"]isTrueevenon errored runs; the original checked
completedfirst, so failed runs werereported as
success./api/view: cloud returns 302 → signed S3 URL.The original session hook tried to strip
X-API-Keyon cross-host redirect,but
requestsdoesn't re-read session headers when building a redirectrequest, so the key was still being sent to the storage backend.
/api/object_info,/api/queue,/api/upload,/api/prompt,/api/vieware all paid-only on free tier (return 403)./api/historywas renamed to/api/history_v2./api/models/<f>was renamedto
/api/experiment/models/<f>.unet/andclip/folders no longer existon cloud — moved to
diffusion_models/andtext_encoders/. The originalskill's check_deps reported every Flux model as missing on cloud as a result.
of a duplicated friendly name kept the bare name, second got
_node_idsuffix, third also got
_node_id(potentially colliding). Now symmetric.embedding:goodvibes:1.2) were neverdetected as model dependencies — workflows would silently fail with
"embedding not found" at runtime.
Critical fixes
poll_statusreports failed runs as successstatus_str==\"error\"BEFOREcompleted:truedownload_outputpath traversalsafe_path_join()containment guard--flat-outputopt-out); auto-suffix_1/_2on collisionX-API-Keyleak to S3 on redirect_StripSensitiveOnRedirectSessionsubclassesSession.rebuild_authto drop X-API-Key/Authorization/Cookie on cross-host hop. Tested.is_cloud_host()in_common.py(handles*.comfy.orgsubdomains)resolve_url()maps/history → /history_v2,/models/<f> → /experiment/models/<f>unet → diffusion_modelsFOLDER_ALIASESconsulted bycheck_depsso Flux deps resolve on cloudcode: folder_not_foundbody markerextra_data.api_key_comfy_orgpolluted with auth key--partner-keyexplicitly passedNone; Rosetta detection addedpip install comfy-clipollutes system Pythonunwrap_workflow()handles{\"prompt\": {...}}shapeNew scripts
_common.pyrun_batch.py--count N --randomize-seedor--sweep '{...}'; parallel up to tier limitws_monitor.pyauto_fix_deps.pycomfy node install/comfy model downloadfor whatever check_deps reports missing (with--dry-run)health_check.py--smoke-testsubmits a tiny workflow then cancels (no compute waste)fetch_logs.pyCoverage expansion
SD3, Wan / Hunyuan / LTX video, IPAdapter, rgthree, easy-use, AnimateDiff
ckpt_name,vae_name,lora_name,unet_name, etc.)now controllable so workflows can be retargeted per run via
--argsExamples
workflows/:sd15_txt2img.json,sdxl_txt2img.json,flux_dev_txt2img.json,sdxl_img2img.json,sdxl_inpaint.json,upscale_4x.json,animatediff_video.json,wan_video_t2v.json+ a README.Tests
117 passed in 2.19s— 105 unit + 8 cloud integration + 4 cross-host security.Cloud integration tests auto-skip without
COMFY_CLOUD_API_KEY. They were runagainst the live Comfy Cloud API during development, including verification
that the rename map and folder aliases produce 0 missing deps for the example
workflows that target cloud-resident models.
Backwards compatibility
All existing CLI flags and behavior preserved. New behavior is opt-in:
--ws,--input-image,--randomize-seed,--flat-output,--partner-key,--api-key(env var fallback),--submit-only. Any script invocation thatworked under v4.1 continues to work unchanged.
What I deliberately deferred
/object_info-driven runtime param validation (would require object_info onevery run; cloud free-tier returns 403)
extra_model_paths.yamlawareness (server-side concern; check_deps trustswhat the server enumerates)
Happy to split into smaller commits, drop any of the new scripts, or scope
back if any of this is too much. The bug fixes in
run_workflow.pyand_common.py(path traversal, status misclassification, X-API-Key leak,streaming downloads, cloud routing) are the safety-critical core if you want
to take only those.