A professional 3D product photography studio powered by AI. Create stunning photorealistic product images using an intuitive 3D scene editor and advanced AI image generation with Gemini 3.
fibostudio bridges the gap between 3D scene composition and AI-powered image generation. Instead of struggling with text prompts, you visually design your product scene in a real-time 3D editor, and our AI translates your exact camera angles, lighting, and composition into photorealistic images using Gemini 3.
- Interactive 3D Scene Editor - Position, rotate, and scale objects with intuitive transform controls
- Real-time Lighting Control - Adjust key, fill, and rim lights with live preview
- AI-Powered Image Generation - Generate photorealistic product photos using Gemini 3
- Voice-Controlled Studio Director - Modify your scene with natural language voice commands, with spoken feedback from Eleven Labs text-to-speech
- Precise Camera & Composition Control - Exact camera angles, lighting, and composition through structured parameters
- Project Management - Save, organize, and manage multiple product photography projects
- User Authentication - Secure signup/login with JWT, plus demo mode for quick testing
- Production Gallery - View and download all generated images
    ┌─────────────────────────────────────────────────────────────┐
    │                   Frontend (React + Vite)                   │
    ├─────────────────────────────────────────────────────────────┤
    │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
    │  │  3D Scene   │  │   Studio    │  │      Dashboard      │  │
    │  │ (Three.js)  │  │  Controls   │  │   (Projects/Auth)   │  │
    │  └─────────────┘  └─────────────┘  └─────────────────────┘  │
    ├─────────────────────────────────────────────────────────────┤
    │                       Services Layer                        │
    │  ┌─────────────────────────────┐   ┌─────────────────────┐  │
    │  │      Gemini AI Service      │   │     API Service     │  │
    │  │    (AI Image Generation)    │   │   (Backend Comm)    │  │
    │  └─────────────────────────────┘   └─────────────────────┘  │
    └─────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                 Backend (Express + Node.js)                 │
    ├─────────────────────────────────────────────────────────────┤
    │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
    │  │    Auth     │  │  Projects   │  │     Middleware      │  │
    │  │   Routes    │  │   Routes    │  │     (JWT Auth)      │  │
    │  └─────────────┘  └─────────────┘  └─────────────────────┘  │
    ├─────────────────────────────────────────────────────────────┤
    │                        MongoDB Atlas                        │
    │  ┌─────────────┐   ┌─────────────────────────────────────┐  │
    │  │    Users    │   │              Projects               │  │
    │  └─────────────┘   └─────────────────────────────────────┘  │
    └─────────────────────────────────────────────────────────────┘
- React 18 - UI framework
- TypeScript - Type safety
- Vite - Build tool and dev server
- Three.js / React Three Fiber - 3D rendering
- @react-three/drei - Three.js helpers
- Tailwind CSS - Styling
- Lucide React - Icons
- Node.js - Runtime
- Express - Web framework
- MongoDB - Database
- Mongoose - ODM
- JWT - Authentication
- bcryptjs - Password hashing
- BRIA FIBO - JSON-native photorealistic image generation
- Google Gemini - Natural language prompt interpretation for Studio Director
- Node.js 18+
- npm or yarn
- MongoDB Atlas account (or local MongoDB)
- FAL.ai API key - for image generation
- Google Gemini API key - for Studio Director and image generation
- Eleven Labs API key - for voice-controlled Studio Director with natural voice synthesis
- Clone the repository

      git clone https://github.com/SatyaPujith/fibostudio.git
      cd fibostudio

- Install frontend dependencies

      npm install

- Install backend dependencies

      cd server
      npm install
      cd ..

- Configure environment variables

  Create `.env.local` in the root directory:

      # Gemini API Key (for Studio Director - prompt interpretation)
      VITE_GEMINI_API_KEY=your_gemini_api_key
      API_KEY=your_gemini_api_key

      # FAL.ai API Key (for image generation)
      VITE_FAL_API_KEY=your_fal_api_key

      # Eleven Labs API Key (for voice-controlled Studio Director)
      VITE_ELEVENLABS_API_KEY=your_elevenlabs_api_key

      # Backend API URL
      VITE_API_URL=http://localhost:5000/api

  Create `server/.env`:

      # MongoDB Connection
      MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/fibostudio

      # JWT Secret (generate a random string)
      JWT_SECRET=your_super_secret_jwt_key

      # Server Port
      PORT=5000

      # Frontend URL (for CORS)
      FRONTEND_URL=http://localhost:3000

      # FAL.ai API Key (for image generation)
      FAL_API_KEY=your_fal_api_key

- Start the backend server

      cd server
      npm run dev

- Start the frontend (in a new terminal)

      npm run dev

- Open your browser and navigate to `http://localhost:3000`
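A missing or blank variable is an easy mistake to make during setup. The following is a minimal sketch of a startup check, assuming the variable names from the `server/.env` example; the `missingEnvVars` helper is hypothetical and not part of the repository.

```typescript
// Hypothetical helper, not part of fibostudio. The variable names are the
// ones listed in the server/.env example.
const REQUIRED_SERVER_VARS: readonly string[] = [
  "MONGODB_URI",
  "JWT_SECRET",
  "PORT",
  "FRONTEND_URL",
  "FAL_API_KEY",
];

type EnvSource = Record<string, string | undefined>;

// Returns the names of required variables that are undefined or blank.
function missingEnvVars(
  env: EnvSource,
  required: readonly string[] = REQUIRED_SERVER_VARS,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with an in-memory object; on the server you would pass process.env.
const missing = missingEnvVars({
  MONGODB_URI: "mongodb://localhost:27017/fibostudio",
  JWT_SECRET: "change-me",
  PORT: "5000",
});
console.log("Still to configure:", missing);
```

Returning the list of names, rather than throwing on the first gap, lets the caller report every missing variable at once.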
- Click "Try Demo" on the landing page
- A sample project will be created automatically
- Explore the 3D editor and generate images
- Note: Demo mode stores data locally only
- Sign Up - Create an account with email and password
- Create Project - Click "New Project" in the dashboard
- Design Scene:
  - Use transform tools (Move, Rotate, Scale) to position objects
  - Adjust lighting with the right panel controls
  - Apply mood presets (Clean, Dark, Warm, Cool)
- Generate Images:
  - Click "Generate Image" to open the batch dialog
  - Add variations or generate the current view
  - Images appear in the Production Gallery
- Download - Hover over any image and click the download button
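The mood presets and the key, fill, and rim lights could be modeled as plain data. This is only an illustrative sketch: the type names, the `applyMood` helper, and every intensity and color value below are invented for the example and are not taken from the fibostudio source.

```typescript
// Illustrative only: all names and values here are assumptions.
interface Light {
  intensity: number; // arbitrary units
  color: string;     // hex color
}

interface LightingRig {
  key: Light;
  fill: Light;
  rim: Light;
}

type MoodPreset = "clean" | "dark" | "warm" | "cool";

const MOOD_PRESETS: Record<MoodPreset, LightingRig> = {
  clean: {
    key: { intensity: 1.2, color: "#ffffff" },
    fill: { intensity: 0.8, color: "#ffffff" },
    rim: { intensity: 0.5, color: "#ffffff" },
  },
  dark: {
    key: { intensity: 0.6, color: "#ffffff" },
    fill: { intensity: 0.1, color: "#8899aa" },
    rim: { intensity: 1.0, color: "#ffffff" },
  },
  warm: {
    key: { intensity: 1.0, color: "#ffd9a0" },
    fill: { intensity: 0.5, color: "#ffe8c8" },
    rim: { intensity: 0.6, color: "#ffb060" },
  },
  cool: {
    key: { intensity: 1.0, color: "#cfe4ff" },
    fill: { intensity: 0.5, color: "#dfefff" },
    rim: { intensity: 0.6, color: "#80b0ff" },
  },
};

// Starts from a preset and lets individual lights be overridden by hand.
function applyMood(
  preset: MoodPreset,
  overrides: Partial<LightingRig> = {},
): LightingRig {
  return { ...MOOD_PRESETS[preset], ...overrides };
}

const rig = applyMood("dark", { rim: { intensity: 2, color: "#ffffff" } });
console.log(rig);
```

Treating a preset as a starting point that manual adjustments override mirrors the workflow described in the steps above: pick a mood, then fine-tune individual lights.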
| Control | Action |
|---|---|
| Left Click + Drag | Rotate camera |
| Right Click + Drag | Pan camera |
| Scroll | Zoom in/out |
| W/E/R | Switch transform mode (Move/Rotate/Scale) |
| Click Object | Select object |
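The W/E/R shortcuts can be expressed as a small lookup. This is a sketch rather than the project's actual handler: the mode names follow the `mode` prop of drei's `TransformControls` (`translate`, `rotate`, `scale`), while `nextTransformMode` is a hypothetical helper.

```typescript
// Mode names as accepted by drei's TransformControls `mode` prop.
type TransformMode = "translate" | "rotate" | "scale";

// W/E/R shortcuts from the controls table (Move/Rotate/Scale).
const KEY_TO_MODE = new Map<string, TransformMode>([
  ["w", "translate"],
  ["e", "rotate"],
  ["r", "scale"],
]);

// Hypothetical helper: returns the mode for a key press, or keeps the
// current mode when the key is not a shortcut. Lookup is case-insensitive.
function nextTransformMode(current: TransformMode, key: string): TransformMode {
  return KEY_TO_MODE.get(key.toLowerCase()) ?? current;
}

console.log(nextTransformMode("translate", "E"));
```

In a React component this function would typically be called from a `keydown` listener with `event.key`, and the result stored in state.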
Speak or type commands in the "Studio Director" input (spoken responses use Eleven Labs voice synthesis). For example:
- "Make it look cinematic with dramatic lighting"
- "Create a vintage camera"
- "Add warm golden hour lighting"
- "Make the background dark and moody"
The AI interprets your intent and adjusts the 3D scene, lighting, and composition accordingly. Spoken commands are transcribed and interpreted with natural language understanding, and the response is spoken back using Eleven Labs voice synthesis.
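One way to apply a model's interpretation safely is to have it return a small JSON patch and validate that patch before touching the scene. The sketch below assumes this approach; the `SceneState` fields, the 0 to 10 intensity range, and both helper functions are hypothetical and do not describe the actual fibostudio implementation.

```typescript
// Hypothetical scene shape; the real project state is likely richer.
interface SceneState {
  background: string;
  mood: string;
  keyLightIntensity: number;
}

type ScenePatch = Partial<SceneState>;

// Parses model output defensively: unknown fields are dropped, wrong types
// are ignored, and the intensity is clamped to an assumed 0..10 range.
function parseScenePatch(raw: string): ScenePatch {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return {}; // not valid JSON: change nothing
  }
  if (typeof data !== "object" || data === null) return {};

  const record = data as Record<string, unknown>;
  const patch: ScenePatch = {};
  if (typeof record.background === "string") patch.background = record.background;
  if (typeof record.mood === "string") patch.mood = record.mood;
  const intensity = record.keyLightIntensity;
  if (typeof intensity === "number" && Number.isFinite(intensity)) {
    patch.keyLightIntensity = Math.min(Math.max(intensity, 0), 10);
  }
  return patch;
}

// Returns a new scene object; the input scene is left untouched.
function applyScenePatch(scene: SceneState, patch: ScenePatch): SceneState {
  return { ...scene, ...patch };
}

const before: SceneState = { background: "#ffffff", mood: "clean", keyLightIntensity: 1 };
const after = applyScenePatch(
  before,
  parseScenePatch('{"background":"#111111","keyLightIntensity":42,"other":true}'),
);
console.log(after);
```

Validating the patch keeps a malformed or unexpected model response from corrupting the editor state.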
| Endpoint | Method | Description |
|---|---|---|
| `/api/auth/signup` | POST | Register new user |
| `/api/auth/login` | POST | Login user |
| `/api/auth/demo` | POST | Demo login (no database) |
| `/api/auth/me` | GET | Get current user |

| Endpoint | Method | Description |
|---|---|---|
| `/api/projects` | GET | List all projects |
| `/api/projects/:id` | GET | Get single project |
| `/api/projects` | POST | Create project |
| `/api/projects/:id` | PUT | Update project |
| `/api/projects/:id` | DELETE | Delete project |
| `/api/projects/:id/images` | POST | Add generated image |
| `/api/projects/stats/summary` | GET | Get user statistics |
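
A client for these endpoints might build its requests along the following lines. This is a sketch under assumptions: it supposes the JWT is sent as an `Authorization: Bearer` header and that bodies are JSON, neither of which is confirmed here; `buildRequest` and the `name` field are hypothetical.

```typescript
type HttpMethod = "GET" | "POST" | "PUT" | "DELETE";

interface ApiRequest {
  url: string;
  method: HttpMethod;
  headers: Record<string, string>;
  body?: string;
}

interface RequestOptions {
  token?: string; // JWT returned by login/signup (assumed Bearer scheme)
  body?: unknown; // serialized as JSON when present
}

// Hypothetical helper: describes a request without sending it, so it can be
// inspected or handed to fetch(request.url, { method, headers, body }).
function buildRequest(
  baseUrl: string,
  method: HttpMethod,
  path: string,
  options: RequestOptions = {},
): ApiRequest {
  const headers: Record<string, string> = {};
  if (options.token) headers["Authorization"] = `Bearer ${options.token}`;

  const request: ApiRequest = {
    url: `${baseUrl.replace(/\/+$/, "")}${path}`, // tolerate a trailing slash
    method,
    headers,
  };
  if (options.body !== undefined) {
    headers["Content-Type"] = "application/json";
    request.body = JSON.stringify(options.body);
  }
  return request;
}

// Base URL matches VITE_API_URL from the setup instructions.
const listProjects = buildRequest("http://localhost:5000/api", "GET", "/projects", {
  token: "<jwt>",
});
console.log(listProjects.method, listProjects.url);
```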
fibostudio/
├── components/
│ ├── AuthPage.tsx # Login/Signup UI
│ ├── Dashboard.tsx # Project management
│ ├── LandingPage.tsx # Marketing page
│ ├── Scene3D.tsx # Three.js 3D scene
│ ├── Studio.tsx # Main editor interface
│ ├── VoiceInput.tsx # Voice input with Eleven Labs
│ └── ...dialogs
├── services/
│ ├── apiService.ts # Backend API client
│ ├── geminiService.ts # Gemini AI integration
│ ├── voiceService.ts # Voice recognition with Eleven Labs
│ └── storageService.ts # Local storage
├── server/
│ ├── models/
│ │ ├── User.ts # User schema
│ │ └── Project.ts # Project schema
│ ├── routes/
│ │ ├── auth.ts # Auth endpoints
│ │ ├── projects.ts # Project endpoints
│ │ └── images.ts # Image generation endpoints
│ ├── middleware/
│ │ └── auth.ts # JWT middleware
│ └── index.ts # Express server
├── App.tsx # Main app component
├── types.ts # TypeScript types
├── constants.ts # Default configs
└── index.tsx # Entry point
fibostudio uses Google Gemini 3 together with FAL.ai for image generation. The pipeline works in two stages:
- Scene Understanding - Gemini analyzes your 3D scene setup (camera angle, lighting, objects, composition)
- Image Generation - FAL.ai's image generation API creates photorealistic images based on the scene parameters
The integration ensures that generated images match your 3D preview as closely as possible, with precise control over:
- Camera angles and perspectives
- Lighting setup and intensity
- Object positioning and scale
- Background and environment
- Overall composition and framing
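
To illustrate how a 3D camera pose could be turned into descriptive parameters such as `angle: "eye_level"` and `position: "front"`, here is a minimal sketch. It assumes a Y-up coordinate system (as in Three.js) with the subject facing +Z; the 15 and 45 degree thresholds, the labels other than `eye_level` and `front`, and the `describeCamera` function are all assumptions, not the project's actual logic.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

type CameraAngle = "low_angle" | "eye_level" | "high_angle";
type CameraPosition = "front" | "right" | "back" | "left";

// Classifies the camera pose relative to the subject it looks at.
function describeCamera(
  camera: Vec3,
  target: Vec3 = { x: 0, y: 0, z: 0 },
): { angle: CameraAngle; position: CameraPosition } {
  const dx = camera.x - target.x;
  const dy = camera.y - target.y;
  const dz = camera.z - target.z;

  // Elevation: angle above (+) or below (-) the horizontal plane.
  const elevationDeg = (Math.atan2(dy, Math.hypot(dx, dz)) * 180) / Math.PI;
  // Azimuth: 0 degrees when the camera sits on the +Z axis (in front).
  const azimuthDeg = (Math.atan2(dx, dz) * 180) / Math.PI;

  let angle: CameraAngle = "eye_level";
  if (elevationDeg > 15) angle = "high_angle";
  else if (elevationDeg < -15) angle = "low_angle";

  let position: CameraPosition;
  if (Math.abs(azimuthDeg) <= 45) position = "front";
  else if (Math.abs(azimuthDeg) >= 135) position = "back";
  else position = azimuthDeg > 0 ? "right" : "left"; // "right" = camera on the +X side

  return { angle, position };
}

console.log(describeCamera({ x: 0, y: 0, z: 5 }));
```

Discrete labels like these are easier to embed in a structured prompt than raw coordinates, at the cost of some precision.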
    // Example scene parameters sent to image generation
    {
      prompt: "Generate a photorealistic image of a white car viewed from the front and at eye level...",
      scene: {
        subject: "Car",
        background: "white",
        environment: "studio"
      },
      camera: {
        angle: "eye_level",
        shot_type: "full_shot",
        position: "front"
      },
      lighting: {
        type: "studio",
        direction: "front",
        intensity: "high"
      },
      style: {
        type: "photorealistic",
        quality: "ultra"
      }
    }

fibostudio features an innovative voice-controlled interface powered by Eleven Labs text-to-speech:
- Natural Voice Input - Speak commands naturally to modify your studio
- Real-time Transcription - Your speech is converted to text instantly
- Voice Feedback - Eleven Labs synthesizes natural-sounding responses
- Seamless Integration - Voice commands are processed through Gemini AI for scene understanding
Simply click the microphone button, speak your command, and the studio updates automatically with voice feedback confirming the changes.
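
The click, speak, update, and confirm cycle can be thought of as a small state machine. The sketch below is one hypothetical way to model it; the state and event names are invented for illustration and are not taken from `VoiceInput.tsx` or `voiceService.ts`.

```typescript
// Hypothetical states and events for the voice interaction cycle.
type VoiceState = "idle" | "listening" | "processing" | "speaking";
type VoiceEvent =
  | "mic_clicked"       // user presses the microphone button
  | "transcript_ready"  // speech has been converted to text
  | "scene_updated"     // the interpreted command was applied to the scene
  | "playback_ended"    // spoken confirmation finished playing
  | "error";            // any failure returns the UI to idle

// Pure transition function: events that do not apply to the current state
// are ignored, so stray callbacks cannot put the UI in an invalid state.
function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  if (event === "error") return "idle";
  switch (state) {
    case "idle":
      return event === "mic_clicked" ? "listening" : state;
    case "listening":
      return event === "transcript_ready" ? "processing" : state;
    case "processing":
      return event === "scene_updated" ? "speaking" : state;
    case "speaking":
      return event === "playback_ended" ? "idle" : state;
    default:
      return state;
  }
}

// Walk through one full cycle and record each state along the way.
const cycle: VoiceEvent[] = ["mic_clicked", "transcript_ready", "scene_updated", "playback_ended"];
const visited: VoiceState[] = [];
let current: VoiceState = "idle";
for (const event of cycle) {
  current = nextVoiceState(current, event);
  visited.push(current);
}
console.log(visited.join(" -> "));
```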
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Gemini for AI-powered scene understanding and image generation
- FAL.ai for reliable image generation infrastructure
- Eleven Labs for natural voice synthesis
- Three.js and React Three Fiber for 3D rendering
- Tailwind CSS for styling
Built with ❤️ for creators who want precise control over AI-generated product photography.