Forge Master — Text-to-3D on Cloud Run (GPU)
Elevator pitch. Forge Master turns short text prompts into production-ready 3D assets—generated, assessed, and delivered in under two minutes. It’s a fully serverless, Cloud Run–native app that uses an L4 GPU to run an open-source 3D pipeline end-to-end, with Google AI for image synthesis and quality analysis.
Try it: Live demo at the public frontend and a clean, documented repo with deployment scripts and architecture notes.
Inspiration
Creating usable 3D assets is still slow and expensive. We wanted a one-click web app that produces game-ready meshes quickly, repeatably, and at a known unit cost—without managing clusters. Cloud Run with GPUs made that feasible for a weekend build, and the Google AI ecosystem closed the quality gap.
What it does
Type a prompt, get a 3D model you can inspect and download as GLB, OBJ, FBX, or STL. Forge Master:
- Generates a studio-lit reference image from your prompt
- Produces multi-view renders and reconstructs a watertight mesh
- Scores quality automatically and, if needed, iterates for improvements
- Serves an in-browser 3D viewer and one-click downloads
Typical runtime: ~90 seconds. Success rate >95%. Average quality score 8.2/10.
How we built it (GPU Category)
Three services on Cloud Run with clear contracts and defense-in-depth:
Frontend (Next.js + React Three Fiber): Public UI, real-time progress, interactive viewer, multi-format download buttons.
Agent Service (FastAPI + Gemini): A four-agent pipeline: prompt enhancement, generation coordination, quality assurance, and iterative improvement. Handles retries, structured results, and logging. Public endpoint consumed only by the frontend.
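The agent service's generate → score → retry loop can be sketched as below. `generate`, `score`, and `enhance` are hypothetical stand-ins for the real Gemini and GPU-service calls, and the threshold and retry count are illustrative, not the production values:

```python
# Sketch of the agent service's quality-assurance loop: generate an
# asset, score it, and retry with an enhanced prompt if it falls short.
# `generate`, `score`, and `enhance` are hypothetical stand-ins for the
# real pipeline calls; threshold and max_attempts are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Result:
    asset: str
    score: float
    attempts: int

def generate_with_qa(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str], float],
    enhance: Callable[[str, float], str],
    threshold: float = 8.0,
    max_attempts: int = 3,
) -> Result:
    best_asset, best_score = "", float("-inf")
    for attempt in range(1, max_attempts + 1):
        asset = generate(prompt)
        s = score(asset)
        if s > best_score:  # keep the best result seen so far
            best_asset, best_score = asset, s
        if s >= threshold:  # good enough, stop early
            return Result(asset, s, attempt)
        prompt = enhance(prompt, s)  # e.g. ask Gemini to refine the prompt
    return Result(best_asset, best_score, max_attempts)
```

The loop always returns the best-scoring asset it saw, so a bounded number of retries never makes the result worse.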
GPU Service (FastAPI on L4 GPU, europe-west1): Text→image via Imagen 4, image→multi-view via Zero123++, multi-view→mesh via LRM, followed by mesh post-processing (smoothing, component isolation, scaling) and multi-format export. Uploads artifacts to Cloud Storage and returns signed URLs.
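To illustrate the post-processing stage, here is a numpy-only sketch of the scaling step: recenter the vertex cloud and scale its longest bounding-box edge to a target size. The real service also performs smoothing and component isolation on the full reconstructed mesh; this shows only the normalization, and the function name is ours:

```python
# Minimal sketch of the mesh "scaling" post-processing step: center the
# vertex array at the origin and scale the longest bounding-box edge to
# target_size, so exported assets land at a predictable world scale.
import numpy as np

def normalize_mesh(vertices: np.ndarray, target_size: float = 1.0) -> np.ndarray:
    """Center vertices at the origin; scale longest bbox edge to target_size."""
    lo, hi = vertices.min(axis=0), vertices.max(axis=0)
    center = (lo + hi) / 2.0
    extent = float((hi - lo).max())
    scale = target_size / extent if extent > 0 else 1.0  # guard degenerate mesh
    return (vertices - center) * scale
```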
Security and cost control
- GPU service is IAM-protected; only the agent service can invoke it (403 to the public).
- Frontend and agent are public; the most expensive path is never directly reachable.
- Fixed, observable cost per model (~$0.56) with clear breakdown (GPU time dominates).
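The cost breakdown can be made concrete with a few lines of arithmetic. The rates below are assumed placeholders, not published Cloud Run or Vertex AI prices; the point is only that GPU-seconds dominate the ~$0.56 unit cost:

```python
# Illustrative per-model cost breakdown. All rates are assumptions for
# the sketch, not actual pricing; GPU time is the dominant term.
GPU_SECONDS = 90      # typical end-to-end runtime
GPU_RATE = 0.0056     # assumed $ per GPU-second on an L4 instance
IMAGE_API = 0.04      # assumed cost of one Imagen call
QA_API = 0.016        # assumed cost of the Gemini QA calls

gpu_cost = GPU_SECONDS * GPU_RATE
total = gpu_cost + IMAGE_API + QA_API
print(f"GPU: ${gpu_cost:.3f}  total: ${total:.2f}")  # GPU time dominates
```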
Why Google Cloud
- Cloud Run (GPU): deploy a full PyTorch + Diffusers stack on L4 with autoscaling and zero-to-N behavior—no cluster admin.
- Imagen 4 and Gemini: higher-quality inputs and consistent QA, accessed via a single credentials model.
- Cloud Storage: durable, CDN-friendly artifact hosting with straightforward CORS and lifecycle policies.
- Cloud Build: push-button CI/CD for three services.
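For the CORS piece, a minimal Cloud Storage configuration of the kind the frontend needs to fetch artifacts might look like the following; the origin is a placeholder, not our real domain:

```json
[
  {
    "origin": ["https://forge-master.example.com"],
    "method": ["GET", "HEAD"],
    "responseHeader": ["Content-Type"],
    "maxAgeSeconds": 3600
  }
]
```

A file like this is applied to the artifact bucket with `gsutil cors set cors.json gs://BUCKET` (or the equivalent `gcloud storage buckets update` command).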
Challenges we ran into
- End-to-end latency: minimized by keeping images small but sufficient for reconstruction, and by reducing post-processing passes.
- Model stability on L4: tuned mixed precision and batch sizes; added health checks and timeouts to avoid stuck reconstructions.
- Public abuse vs. usability: solved with Cloud Run IAM—agent can call GPU, browsers cannot.
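The timeout guard from the stability bullet can be sketched with asyncio. `reconstruct` is a hypothetical stand-in for the mesh-reconstruction call, and the 120-second budget is illustrative:

```python
# Sketch of the timeout guard around mesh reconstruction: if the GPU
# step hangs, fail fast instead of holding the instance indefinitely.
# `reconstruct` is a hypothetical stand-in; the budget is illustrative.
import asyncio
from typing import Awaitable, Callable

async def reconstruct_with_timeout(
    reconstruct: Callable[[], Awaitable[bytes]],
    timeout_s: float = 120.0,
) -> bytes:
    try:
        return await asyncio.wait_for(reconstruct(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Surface a clean error so the agent service can retry or abort.
        raise RuntimeError(f"reconstruction exceeded {timeout_s:.0f}s budget")
```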
Accomplishments we’re proud of
- A clean three-tier architecture that scales from hackathon to production without rewrites.
- A full open-source 3D pipeline (InstantMesh) running reliably on Cloud Run GPU.
- A measurable quality system: per-model stats, scoring, and targeted re-generation when scores fall short.
What we learned
- Cloud Run GPUs are practical for on-demand inference when you bound the workload and secure the hot path.
- Quality is a system property: prompt design (Gemini) + reference image (Imagen) + mesh post-processing matter as much as the core 3D model.
- IAM is the simplest and most effective line of defense for cost control in public AI apps.
What’s next
- Batch jobs for bulk asset generation and dataset creation (Cloud Run Jobs).
- Optional texture baking and PBR maps.
- Project workspaces, history, and fine-grained rate limits.
- Veo/animated asset experiments for simple motion previews.
How this meets the judging criteria
Technical Implementation (40%)
- Well-executed, documented codebase: three independently deployable services, FastAPI type-validated endpoints, structured logs, and health checks.
- Core Cloud Run concepts: service-to-service auth via IAM, GPU service isolation, autoscaling, zero-to-idle, and CI/CD via Cloud Build.
- Production signals: error handling, bounded timeouts, QA scoring, iterative retries, and deterministic cost per request.
- UX: responsive Next.js frontend, progress states, and an integrated 3D viewer.
Demo & Presentation (40%)
- Clear problem statement: fast, affordable, serverless 3D generation.
- Effective walkthrough: prompt → image → multi-view → mesh → download, with a quality badge.
- Architecture diagram and docs: included in the repo with deployment guides and service READMEs.
- Live link: public demo to test real prompts and download results.
Innovation & Creativity (20%)
- Novel combination: serverless GPU reconstruction paired with Google Imagen and Gemini for higher-quality inputs and automated QA.
- Impact: compresses a traditionally manual pipeline into a single web flow with predictable cost, enabling indie games, prototyping, and education.
Built With
- cloud-build
- cloud-run-(gpu-and-standard)
- cloud-storage
- fastapi
- gemini
- imagen
- nextjs
