FiboStudio

A professional 3D product photography studio powered by AI. Create stunning photorealistic product images using an intuitive 3D scene editor and advanced AI image generation with Gemini 3.

Overview

fibostudio bridges the gap between 3D scene composition and AI-powered image generation. Instead of struggling with text prompts, you visually design your product scene in a real-time 3D editor, and our AI translates your exact camera angles, lighting, and composition into photorealistic images using Gemini 3.

Key Features

  • Interactive 3D Scene Editor - Position, rotate, and scale objects with intuitive transform controls
  • Real-time Lighting Control - Adjust key, fill, and rim lights with live preview
  • AI-Powered Image Generation - Generate photorealistic product photos using Gemini 3
  • Voice-Controlled Studio Director - Modify your scene with natural language voice commands and hear spoken feedback synthesized by Eleven Labs text-to-speech
  • Precise Camera & Composition Control - Exact camera angles, lighting, and composition through structured parameters
  • Project Management - Save, organize, and manage multiple product photography projects
  • User Authentication - Secure signup/login with JWT, plus demo mode for quick testing
  • Production Gallery - View and download all generated images

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      Frontend (React + Vite)                │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  3D Scene   │  │   Studio    │  │     Dashboard       │  │
│  │  (Three.js) │  │  Controls   │  │   (Projects/Auth)   │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                      Services Layer                         │
│  ┌─────────────────────────────┐  ┌─────────────────────┐   │
│  │    Gemini AI Service        │  │    API Service      │   │
│  │  (AI Image Generation)      │  │  (Backend Comm)     │   │
│  └─────────────────────────────┘  └─────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                   Backend (Express + Node.js)               │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │    Auth     │  │  Projects   │  │    Middleware       │  │
│  │   Routes    │  │   Routes    │  │   (JWT Auth)        │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                     MongoDB Atlas                           │
│  ┌─────────────┐  ┌─────────────────────────────────────┐   │
│  │    Users    │  │              Projects               │   │
│  └─────────────┘  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

Tech Stack

Frontend

  • React 18 - UI framework
  • TypeScript - Type safety
  • Vite - Build tool and dev server
  • Three.js / React Three Fiber - 3D rendering
  • @react-three/drei - Three.js helpers
  • Tailwind CSS - Styling
  • Lucide React - Icons

Backend

  • Node.js - Runtime
  • Express - Web framework
  • MongoDB - Database
  • Mongoose - ODM
  • JWT - Authentication
  • bcryptjs - Password hashing

AI Services

  • BRIA FIBO - JSON-native photorealistic image generation
  • Google Gemini - Natural language prompt interpretation for Studio Director

Getting Started

Prerequisites

  • Node.js 18+
  • npm or yarn
  • MongoDB Atlas account (or local MongoDB)
  • FAL.ai API key - for image generation
  • Google Gemini API key - for Studio Director and image generation
  • Eleven Labs API key - for voice-controlled Studio Director with natural voice synthesis

Installation

  1. Clone the repository

    git clone https://github.com/SatyaPujith/fibostudio.git
    cd fibostudio
  2. Install frontend dependencies

    npm install
  3. Install backend dependencies

    cd server
    npm install
    cd ..
  4. Configure environment variables

    Create .env.local in the root directory:

    # Gemini API Key (for Studio Director - prompt interpretation)
    VITE_GEMINI_API_KEY=your_gemini_api_key
    API_KEY=your_gemini_api_key
    
    # FAL.ai API Key (for image generation)
    VITE_FAL_API_KEY=your_fal_api_key
    
    # Eleven Labs API Key (for voice-controlled Studio Director)
    VITE_ELEVENLABS_API_KEY=your_elevenlabs_api_key
    
    # Backend API URL
    VITE_API_URL=http://localhost:5000/api

    Create server/.env:

    # MongoDB Connection
    MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/fibostudio
    
    # JWT Secret (generate a random string)
    JWT_SECRET=your_super_secret_jwt_key
    
    # Server Port
    PORT=5000
    
    # Frontend URL (for CORS)
    FRONTEND_URL=http://localhost:3000
    
    # FAL.ai API Key (for image generation)
    FAL_API_KEY=your_fal_api_key
  5. Start the backend server

    cd server
    npm run dev
  6. Start the frontend (in a new terminal)

    npm run dev
  7. Open your browser

    Navigate to http://localhost:3000
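A missing key in .env.local usually shows up only when a feature fails at runtime, so a startup check can help. The following is a minimal sketch, not code from the repository: the key names come from the .env.local example above, while `missingEnvKeys` and `EnvRecord` are hypothetical names introduced for illustration.

```typescript
// Keys listed in the .env.local example above.
const REQUIRED_FRONTEND_KEYS = [
  "VITE_GEMINI_API_KEY",
  "VITE_FAL_API_KEY",
  "VITE_ELEVENLABS_API_KEY",
  "VITE_API_URL",
] as const;

type EnvRecord = Record<string, string | undefined>;

// Hypothetical helper: returns the required keys that are unset or blank.
function missingEnvKeys(
  env: EnvRecord,
  required: readonly string[] = REQUIRED_FRONTEND_KEYS
): string[] {
  return required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Example with a partially filled environment.
const missing = missingEnvKeys({
  VITE_GEMINI_API_KEY: "your_gemini_api_key",
  VITE_API_URL: "http://localhost:5000/api",
});
console.log(missing); // ["VITE_FAL_API_KEY", "VITE_ELEVENLABS_API_KEY"]
```

In a Vite app the same helper would typically be called with `import.meta.env`, which is where Vite exposes variables prefixed with `VITE_`.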

Usage

Quick Start (Demo Mode)

  1. Click "Try Demo" on the landing page
  2. A sample project will be created automatically
  3. Explore the 3D editor and generate images
  4. Note: Demo mode stores data locally only
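Storing demo data locally could look roughly like the sketch below. It assumes a store with the same `getItem`/`setItem` shape as the browser's `localStorage`; the `DemoProject` fields, the storage key, and all function names are hypothetical and not taken from storageService.ts.

```typescript
// Same method shape as the browser's localStorage, so either can be passed in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical project shape for illustration only.
interface DemoProject {
  id: string;
  name: string;
  images: string[];
}

const DEMO_KEY = "fibostudio.demo.projects"; // hypothetical storage key

// In-memory stand-in for localStorage, useful outside the browser.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Reads the saved projects, falling back to an empty list on missing or corrupt data.
function loadProjects(store: KeyValueStore): DemoProject[] {
  const raw = store.getItem(DEMO_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as DemoProject[]) : [];
  } catch {
    return [];
  }
}

// Inserts the project, replacing any existing entry with the same id.
function saveProject(store: KeyValueStore, project: DemoProject): void {
  const others = loadProjects(store).filter((p) => p.id !== project.id);
  store.setItem(DEMO_KEY, JSON.stringify([...others, project]));
}

const store = new MemoryStore();
saveProject(store, { id: "demo-1", name: "Sample Project", images: [] });
saveProject(store, { id: "demo-1", name: "Sample Project (edited)", images: [] });
console.log(loadProjects(store).length); // 1
```

Passing the store in as a parameter keeps the logic testable without a browser; in the app, `window.localStorage` would satisfy the same interface.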

Full Experience

  1. Sign Up - Create an account with email and password
  2. Create Project - Click "New Project" in the dashboard
  3. Design Scene:
    • Use transform tools (Move, Rotate, Scale) to position objects
    • Adjust lighting with the right panel controls
    • Apply mood presets (Clean, Dark, Warm, Cool)
  4. Generate Images:
    • Click "Generate Image" to open the batch dialog
    • Add variations or generate the current view
    • Images appear in the Production Gallery
  5. Download - Hover over any image and click the download button
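The mood presets in step 3 can be thought of as named bundles of lighting and background settings. The sketch below illustrates that idea only: the preset names come from the list above, but every colour and intensity value, along with the `StudioLook` type and `applyMood` function, is a hypothetical placeholder rather than the project's actual configuration.

```typescript
type Mood = "Clean" | "Dark" | "Warm" | "Cool";

// Hypothetical settings bundle covering background plus key, fill, and rim lights.
interface StudioLook {
  background: string;
  lightColor: string;
  keyIntensity: number;
  fillIntensity: number;
  rimIntensity: number;
}

// Placeholder values chosen for illustration, not taken from constants.ts.
const MOOD_PRESETS: Record<Mood, StudioLook> = {
  Clean: { background: "#ffffff", lightColor: "#ffffff", keyIntensity: 1.2, fillIntensity: 0.8, rimIntensity: 0.4 },
  Dark: { background: "#111111", lightColor: "#ffffff", keyIntensity: 0.9, fillIntensity: 0.2, rimIntensity: 1.0 },
  Warm: { background: "#f5e6d3", lightColor: "#ffd7a8", keyIntensity: 1.0, fillIntensity: 0.6, rimIntensity: 0.5 },
  Cool: { background: "#e3edf7", lightColor: "#cfe4ff", keyIntensity: 1.0, fillIntensity: 0.6, rimIntensity: 0.5 },
};

// Returns a copy so later edits to the scene do not mutate the preset table.
function applyMood(mood: Mood): StudioLook {
  return { ...MOOD_PRESETS[mood] };
}

const look = applyMood("Dark");
console.log(look.background); // "#111111"
```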

Studio Controls

Control              Action
Left Click + Drag    Rotate camera
Right Click + Drag   Pan camera
Scroll               Zoom in/out
W/E/R                Switch transform mode (Move/Rotate/Scale)
Click Object         Select object
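The W/E/R shortcuts map naturally onto the `"translate"`, `"rotate"`, and `"scale"` modes that the `TransformControls` helper in @react-three/drei accepts. A minimal sketch of that mapping, assuming a keydown handler passes the pressed key in; `modeForKey` is a hypothetical name, not a function from the repository:

```typescript
// Mode names accepted by TransformControls in three.js and drei.
type TransformMode = "translate" | "rotate" | "scale";

const KEY_TO_MODE: Record<string, TransformMode> = {
  w: "translate", // Move
  e: "rotate",    // Rotate
  r: "scale",     // Scale
};

// Returns the mode for a shortcut key, or keeps the current mode for any other key.
function modeForKey(key: string, current: TransformMode): TransformMode {
  const next: TransformMode | undefined = KEY_TO_MODE[key.toLowerCase()];
  return next ?? current;
}

console.log(modeForKey("E", "translate")); // "rotate"
console.log(modeForKey("x", "scale"));     // "scale"
```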

AI Prompt Director

Speak or type commands in the "Studio Director" input; responses are spoken back using Eleven Labs voice synthesis. Example commands:

  • "Make it look cinematic with dramatic lighting"
  • "Create a vintage camera"
  • "Add warm golden hour lighting"
  • "Make the background dark and moody"

The AI interprets your intent and adjusts the 3D scene, lighting, and composition accordingly. Voice commands are processed with natural language understanding, and the confirmation is synthesized with Eleven Labs.

API Reference

Authentication

Endpoint            Method   Description
/api/auth/signup    POST     Register new user
/api/auth/login     POST     Login user
/api/auth/demo      POST     Demo login (no database)
/api/auth/me        GET      Get current user
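As a rough illustration of calling these endpoints, the sketch below builds request descriptions for signup and for fetching the current user. The paths and methods come from the table above and the base URL from the `VITE_API_URL` example; the JSON body fields (`email`, `password`) and the `Authorization: Bearer` header are assumptions about how the server expects credentials, and the function names are hypothetical.

```typescript
interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

const API_URL = "http://localhost:5000/api"; // value of VITE_API_URL in the setup above

// Assumed body shape: the README only says signup uses email and password.
function signupRequest(email: string, password: string): HttpRequest {
  return {
    url: `${API_URL}/auth/signup`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password }),
  };
}

// Assumed header format: JWTs are commonly sent as a Bearer token.
function currentUserRequest(token: string): HttpRequest {
  return {
    url: `${API_URL}/auth/me`,
    method: "GET",
    headers: { Authorization: `Bearer ${token}` },
  };
}

const signup = signupRequest("user@example.com", "s3cret");
console.log(signup.method, signup.url); // POST http://localhost:5000/api/auth/signup
```

Each description can be passed to `fetch` as `fetch(req.url, req)`.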

Projects

Endpoint                      Method   Description
/api/projects                 GET      List all projects
/api/projects/:id             GET      Get single project
/api/projects                 POST     Create project
/api/projects/:id             PUT      Update project
/api/projects/:id             DELETE   Delete project
/api/projects/:id/images      POST     Add generated image
/api/projects/stats/summary   GET      Get user statistics
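The project routes can be captured in a small typed lookup so that client code never assembles paths by hand. This is a sketch of one way to do it; the methods and paths mirror the table above, while `ProjectAction` and `projectEndpoint` are hypothetical names.

```typescript
type ProjectAction = "list" | "get" | "create" | "update" | "delete" | "addImage" | "stats";

interface Endpoint {
  method: "GET" | "POST" | "PUT" | "DELETE";
  path: string;
}

// Resolves an action to its method and path; id is required for per-project actions.
function projectEndpoint(action: ProjectAction, id?: string): Endpoint {
  const base = "/api/projects";
  switch (action) {
    case "list":
      return { method: "GET", path: base };
    case "create":
      return { method: "POST", path: base };
    case "stats":
      return { method: "GET", path: `${base}/stats/summary` };
  }
  if (id === undefined || id === "") {
    throw new Error(`Action "${action}" requires a project id`);
  }
  const item = `${base}/${encodeURIComponent(id)}`;
  switch (action) {
    case "get":
      return { method: "GET", path: item };
    case "update":
      return { method: "PUT", path: item };
    case "delete":
      return { method: "DELETE", path: item };
    default:
      return { method: "POST", path: `${item}/images` }; // addImage
  }
}

console.log(projectEndpoint("addImage", "42")); // { method: "POST", path: "/api/projects/42/images" }
```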

Project Structure

fibostudio/
├── components/
│   ├── AuthPage.tsx        # Login/Signup UI
│   ├── Dashboard.tsx       # Project management
│   ├── LandingPage.tsx     # Marketing page
│   ├── Scene3D.tsx         # Three.js 3D scene
│   ├── Studio.tsx          # Main editor interface
│   ├── VoiceInput.tsx      # Voice input with Eleven Labs
│   └── ...dialogs
├── services/
│   ├── apiService.ts       # Backend API client
│   ├── geminiService.ts    # Gemini AI integration
│   ├── voiceService.ts     # Voice recognition with Eleven Labs
│   └── storageService.ts   # Local storage
├── server/
│   ├── models/
│   │   ├── User.ts         # User schema
│   │   └── Project.ts      # Project schema
│   ├── routes/
│   │   ├── auth.ts         # Auth endpoints
│   │   ├── projects.ts     # Project endpoints
│   │   └── images.ts       # Image generation endpoints
│   ├── middleware/
│   │   └── auth.ts         # JWT middleware
│   └── index.ts            # Express server
├── App.tsx                 # Main app component
├── types.ts                # TypeScript types
├── constants.ts            # Default configs
└── index.tsx               # Entry point

Image Generation with Gemini 3

fibostudio pairs Google Gemini 3 with FAL.ai's image generation API. The system works in two stages:

  1. Scene Understanding - Gemini analyzes your 3D scene setup (camera angle, lighting, objects, composition)
  2. Image Generation - FAL.ai's image generation API creates photorealistic images based on the scene parameters

The integration ensures that generated images match your 3D preview as closely as possible, with precise control over:

  • Camera angles and perspectives
  • Lighting setup and intensity
  • Object positioning and scale
  • Background and environment
  • Overall composition and framing

// Example scene parameters sent to image generation
const sceneParams = {
  prompt: "Generate a photorealistic image of a white car viewed from the front and at eye level...",
  scene: {
    subject: "Car",
    background: "white",
    environment: "studio"
  },
  camera: {
    angle: "eye_level",
    shot_type: "full_shot",
    position: "front"
  },
  lighting: {
    type: "studio",
    direction: "front",
    intensity: "high"
  },
  style: {
    type: "photorealistic",
    quality: "ultra"
  }
};
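To make the `camera` fields concrete, here is a sketch of how a 3D camera position might be reduced to labels such as `"eye_level"` and `"front"`. It is an illustration under stated assumptions, not the project's actual logic: the subject is assumed to face the +Z axis, the 15 degree threshold is arbitrary, and the `"high_angle"`, `"low_angle"`, `"back"`, `"left"`, and `"right"` labels are hypothetical extensions of the two values shown in the example.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

type CameraAngle = "low_angle" | "eye_level" | "high_angle";
type CameraPosition = "front" | "back" | "left" | "right";

// Classifies the camera relative to the target it looks at.
function describeCamera(
  camera: Vec3,
  target: Vec3
): { angle: CameraAngle; position: CameraPosition } {
  const dx = camera.x - target.x;
  const dy = camera.y - target.y;
  const dz = camera.z - target.z;

  // Elevation above the horizontal plane, in degrees.
  const horizontal = Math.sqrt(dx * dx + dz * dz);
  const elevationDeg = (Math.atan2(dy, horizontal) * 180) / Math.PI;

  let angle: CameraAngle = "eye_level";
  if (elevationDeg > 15) angle = "high_angle";
  else if (elevationDeg < -15) angle = "low_angle";

  // Assumption: the subject faces +Z, so a camera on +Z sees its front.
  let position: CameraPosition;
  if (Math.abs(dz) >= Math.abs(dx)) position = dz >= 0 ? "front" : "back";
  else position = dx >= 0 ? "right" : "left";

  return { angle, position };
}

const origin: Vec3 = { x: 0, y: 0, z: 0 };
console.log(describeCamera({ x: 0, y: 0, z: 5 }, origin)); // { angle: "eye_level", position: "front" }
console.log(describeCamera({ x: 0, y: 5, z: 5 }, origin)); // { angle: "high_angle", position: "front" }
```

Discrete labels like these are easier for an image model to act on than raw coordinates, which is presumably why the payload uses them.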

Voice-Controlled Studio Director with Eleven Labs

fibostudio features an innovative voice-controlled interface powered by Eleven Labs text-to-speech:

  • Natural Voice Input - Speak commands naturally to modify your studio
  • Real-time Transcription - Your speech is converted to text instantly
  • Voice Feedback - Eleven Labs synthesizes natural-sounding responses
  • Seamless Integration - Voice commands are processed through Gemini AI for scene understanding

Simply click the microphone button, speak your command, and the studio updates automatically with voice feedback confirming the changes.
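The voice feedback step could be sketched as building a request to Eleven Labs' text-to-speech REST endpoint. The URL pattern and the `xi-api-key` header reflect the public Eleven Labs API as generally documented, but check them against the current API reference; the function names, the confirmation wording, and the choice of voice ID are hypothetical and not taken from voiceService.ts.

```typescript
interface SpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a text-to-speech request for the given voice.
function buildSpeechRequest(text: string, voiceId: string, apiKey: string): SpeechRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text }),
  };
}

// Hypothetical confirmation message spoken after a command is applied.
function confirmationFor(command: string): string {
  return `Done. I applied: ${command.trim()}`;
}

const speech = buildSpeechRequest(
  confirmationFor(" Add warm golden hour lighting "),
  "your_voice_id",
  "your_elevenlabs_api_key"
);
console.log(speech.url); // https://api.elevenlabs.io/v1/text-to-speech/your_voice_id
```

Sending it with `fetch(speech.url, speech)` should return audio data that the browser can play back.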

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments


Built with ❤️ for creators who want precise control over AI-generated product photography.
