feat(image-gen): add ModelScope backend for image generation#83

Merged: hugohe3 merged 5 commits into hugohe3:main from Ahusm:main on May 2, 2026

Ahusm commented May 2, 2026

Contributor

feat(image-gen): add ModelScope backend for image generation

Introduces a new image generation backend using the ModelScope (魔塔社区) API, with support for Z-Image-Turbo and other models.

Key features:
- Async task-based generation workflow with polling
- Multiple resolution presets (512px, 1K, 2K, 4K)
- Aspect ratio support (1:1, 3:4, 4:3, 9:16, 16:9)
- Configurable via MODELSCOPE_API_KEY, MODELSCOPE_MODEL, MODELSCOPE_BASE_URL

Configuration:
  IMAGE_BACKEND=modelscope
  MODELSCOPE_API_KEY=your-api-key
  MODELSCOPE_MODEL=Tongyi-MAI/Z-Image-Turbo (default)
  MODELSCOPE_BASE_URL=https://api-inference.modelscope.cn/v1 (default)

Also updates .gitignore to exclude Python version files and build artifacts.
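
For illustration, the configuration above could be read roughly as follows. This is a minimal sketch, not the code in this PR: `ModelScopeConfig` and `load_modelscope_config` are hypothetical names, and only the variable names and default values come from the description.

```python
import os
from dataclasses import dataclass

# Defaults as stated in the PR description.
DEFAULT_MODEL = "Tongyi-MAI/Z-Image-Turbo"
DEFAULT_BASE_URL = "https://api-inference.modelscope.cn/v1"


@dataclass(frozen=True)
class ModelScopeConfig:  # hypothetical container, not part of the PR
    api_key: str
    model: str
    base_url: str


def load_modelscope_config(env=None):
    """Read the MODELSCOPE_* variables and apply the documented defaults."""
    env = os.environ if env is None else env
    api_key = (env.get("MODELSCOPE_API_KEY") or "").strip()
    if not api_key:
        # Fail early instead of sending unauthenticated requests.
        raise RuntimeError("MODELSCOPE_API_KEY is required when IMAGE_BACKEND=modelscope")
    return ModelScopeConfig(
        api_key=api_key,
        model=env.get("MODELSCOPE_MODEL") or DEFAULT_MODEL,
        base_url=env.get("MODELSCOPE_BASE_URL") or DEFAULT_BASE_URL,
    )
```

Passing a plain dict as `env` keeps the function easy to test without touching the real process environment.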

Ahusm added 2 commits May 2, 2026 13:45
Repository owner deleted a comment from chatgpt-codex-connector Bot May 2, 2026
hugohe3 commented May 2, 2026

Owner

I recommend not merging this PR yet. Adding ModelScope as an image backend is a reasonable direction and it fits as an experimental backend, but the current implementation does not meet the minimum behavior contract used by the existing image backends in this repository.

Blocking issues:

  1. In backend_modelscope.py, the SUCCEED branch saves the image and then only breaks; it never returns the saved path. The FAILED branch also only breaks. As a result, _generate_image() can implicitly return None, and the CLI may exit successfully without producing a usable image.
  2. Task polling uses an unbounded while True. If the remote task stays pending/running, the PPT generation flow can hang forever. The repo already has backend_common.poll_json() for ready/failed/timeout handling, so this should reuse it or implement equivalent total-timeout behavior.
  3. Image download uses raw requests.get(...).content without timeout, HTTP status validation, or content-type/format handling. Existing backends consistently use download_image() or save_image_bytes(); this backend should do the same.
  4. _resolve_url() uses rstrip("/v1"), which strips a character set rather than a suffix. It also forces the endpoint to end in .cn, which breaks MODELSCOPE_BASE_URL for proxies, test services, and compatible gateways.
  5. .gitignore should not ignore pyproject.toml; that would block future project-level Python/uv configuration. System.IO.Stream+NullStream also looks like a local artifact and should not be part of this PR.
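
To make issue 4 concrete, here is a small sketch of why `rstrip("/v1")` is unsafe and what suffix-only normalization might look like. `strip_v1_suffix` is a hypothetical helper, not a function in this repository, and `proxy.example.dev` is a made-up host.

```python
def strip_v1_suffix(base_url: str) -> str:
    """Drop trailing slashes and at most one '/v1' suffix; never touch the host."""
    url = base_url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url


# str.rstrip treats its argument as the character set {'/', 'v', '1'},
# so it keeps eating matching characters and corrupts the host name here.
print("https://proxy.example.dev/v1".rstrip("/v1"))     # https://proxy.example.de
print(strip_v1_suffix("https://proxy.example.dev/v1"))  # https://proxy.example.dev
```

The default `https://api-inference.modelscope.cn/v1` happens to survive `rstrip("/v1")` only because `n` is not in the character set. On Python 3.9 or later, `str.removesuffix("/v1")` would express the same intent.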

Suggested minimal fixes:

  • On SUCCEED, validate that output_images is present, then return download_image(image_url, path).
  • On FAILED, raise a RuntimeError that includes the remote response payload.
  • Use poll_json(..., status_label="task_status", ready_values=["SUCCEED"], failed_values=["FAILED"]), or add equivalent bounded polling with a total timeout.
  • Use download_image() for the final image URL instead of reading .content directly.
  • Normalize only the optional /v1 suffix in the base URL; do not hard-code a domain allowlist.
  • Remove the unrelated .gitignore entries.
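
If `poll_json()` cannot be reused directly, the "equivalent bounded polling" mentioned above might look like the sketch below. It is a standalone illustration and does not reproduce the signature of `backend_common.poll_json()`; `poll_task` and `fetch_status` are hypothetical names, and the timeout and interval values are arbitrary.

```python
import time


def poll_task(fetch_status, *, timeout_s=300.0, interval_s=2.0,
              sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reports SUCCEED or FAILED, or the total timeout passes.

    fetch_status is a hypothetical callable that returns the decoded JSON payload.
    """
    deadline = clock() + timeout_s
    while True:
        payload = fetch_status()
        status = payload.get("task_status")
        if status == "SUCCEED":
            return payload  # caller extracts the image URL and downloads it
        if status == "FAILED":
            # Surface the remote payload so the failure is diagnosable.
            raise RuntimeError(f"ModelScope task failed: {payload!r}")
        if clock() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Every exit path either returns the payload or raises, so the caller can never receive an implicit `None`. Injecting `sleep` and `clock` lets tests exercise the timeout branch without waiting.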

Platform-wise, ModelScope is worth supporting, especially for Chinese prompts, domestic access, and the open-source/LoRA model ecosystem. However, its image API examples vary between synchronous and asynchronous response shapes, so this backend needs conservative handling for task state, timeouts, missing fields, and download failures. I would keep it experimental after these fixes and avoid promoting it to a core backend for now.
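
As one example of the conservative handling described here, the sketch below validates a finished task payload before anything is downloaded. It assumes `output_images` is a non-empty list of URL strings, which this thread does not confirm; `extract_image_url` is a hypothetical helper.

```python
def extract_image_url(payload):
    """Return the first image URL from a finished task payload, or raise."""
    if not isinstance(payload, dict):
        raise RuntimeError(f"unexpected response type: {type(payload).__name__}")
    images = payload.get("output_images")
    # Assumption: output_images is a list of URL strings.
    if not isinstance(images, list) or not images:
        raise RuntimeError(f"response has no output_images: {payload!r}")
    url = images[0]
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise RuntimeError(f"output_images[0] is not a URL: {url!r}")
    return url
```

The returned URL would then go to the repository's `download_image()` rather than a raw `requests.get(...).content`, so timeouts and HTTP status checks stay in one place.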

Ahusm and others added 3 commits May 2, 2026 17:31
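- Move the ModelScope configuration in .env.example to the "Extended / Experimental backends" section
- Implement ModelScope async image-task polling with the `poll_json()` function from `skills\ppt-master\scripts\image_backends\backend_common.py`
- Fix the ModelScope URL resolution logic so it no longer forces the URL to end in `.cn`
- Remove the extra entries added to .gitignore in the previous commit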
@hugohe3 hugohe3 merged commit a64e28b into hugohe3:main May 2, 2026
mosjin added a commit to mosjin/ppt-master that referenced this pull request May 4, 2026
- shared-standards.md Basic SVG Rules:
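  - Content text font-size ≥ 15px (UI chrome at 11-12px is exempt)
  - Every <text> must carry an explicit font-weight (minimum 500; headings 700+)
  - Clarified: a missing font-weight defaults to 400, which renders hair-thin on a projector screen
  - font-style="italic" is prohibited for content text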