AI-enhanced image playground built with Next.js App Router, Mantine UI, and Google Gemini. Upload an image, describe the transformation, and receive AI-generated text and, optionally, a new image, with a clear per-request cost breakdown.
- Multimodal prompts: Send text with an optional image to Gemini.
- Image generation/transform: Receives inline image data when the model returns one.
- AI Image Enhancement: Restore and enhance image quality using CodeFormer face restoration.
- Cost transparency: Detailed token accounting and pricing shown in a modal.
- Beautiful UI: Mantine AppShell, polished chat and editor panels.
- Canvas controls: Zoom in/out, fit-to-screen, and one-click download.
- TypeScript-first: Strict types across client and server.
- Node.js 18+ recommended
- A Google Gemini API key
- A Replicate API key (optional, for image enhancement features)
pnpm install
# or
npm install
Create .env.local in the project root:
GEMINI_API_KEY=your_api_key_here
# Optional: For image enhancement features
REPLICATE_API_TOKEN=your_replicate_api_key_here
pnpm dev
# or
npm run dev
Open http://localhost:3000.
- Server route app/api/generate-image/route.ts calls @google/genai with model gemini-2.5-flash-image-preview and streams content.
- The API returns JSON containing:
  - text: aggregated streamed text
  - image: optional { data: base64, mimeType }
  - cost: computed pricing for input/output tokens and image generation
- Client app/components/ChatInterface.tsx handles prompt + image upload and displays message history with a cost modal.
- Client app/components/EditorView.tsx renders the generated image with zoom and download controls.
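The aggregation step described above (streamed text joined into one string, plus an optional inline image) can be sketched as a small fold over chunks. This is an illustrative sketch only: `StreamChunk`, `mergeChunk`, and `aggregateStream` are hypothetical names, and the chunk shape is a simplification, not the actual structure returned by @google/genai.

```typescript
// Hypothetical, simplified chunk shape for illustration. Real @google/genai
// stream chunks are structured differently and would need mapping to this.
interface StreamChunk {
  text?: string;
  inlineData?: { data: string; mimeType: string };
}

interface AggregatedResult {
  text: string;
  image?: { data: string; mimeType: string };
}

// Fold one chunk into the running result: text is concatenated,
// and the most recent inline image (if any) is kept.
function mergeChunk(acc: AggregatedResult, chunk: StreamChunk): AggregatedResult {
  return {
    text: acc.text + (chunk.text ?? ""),
    image: chunk.inlineData ?? acc.image,
  };
}

// Drain an async stream of chunks into a single response payload.
async function aggregateStream(
  stream: AsyncIterable<StreamChunk>
): Promise<AggregatedResult> {
  let result: AggregatedResult = { text: "" };
  for await (const chunk of stream) {
    result = mergeChunk(result, chunk);
  }
  return result;
}
```

Keeping the per-chunk merge as a pure function makes the aggregation easy to test without a live stream.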
- Content-Type: multipart/form-data
- Body fields:
  - prompt (string, required)
  - image (file, optional)
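A client might assemble this multipart body roughly as follows. The helper name `buildGenerateForm` and the file name `upload.png` are made up for illustration; only the `prompt` and `image` field names come from the endpoint description.

```typescript
// Build the multipart body for POST /api/generate-image.
// `prompt` is required; `image` is optional, matching the documented fields.
function buildGenerateForm(prompt: string, image?: Blob): FormData {
  if (prompt.trim() === "") {
    // Fail early on the client; the route answers 400 when prompt is missing.
    throw new Error("prompt is required");
  }
  const form = new FormData();
  form.append("prompt", prompt);
  if (image) {
    // The file name is arbitrary here (hypothetical value).
    form.append("image", image, "upload.png");
  }
  return form;
}
```

The result can be passed as the `body` of a `fetch` POST. Leave the Content-Type header unset so the runtime adds the multipart boundary itself.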
{
  "text": "optional text",
  "image": {
    "data": "<base64>",
    "mimeType": "image/png"
  },
  "cost": {
    "inputTokens": 0,
    "outputTokens": 0,
    "inputImageTokens": 0,
    "generatedImages": 0,
    "totalTokens": 0,
    "inputCost": 0,
    "outputCost": 0,
    "imageCost": 0,
    "totalCost": 0,
    "formattedCost": "$0.0000"
  }
}
- 400: Missing prompt
- 500: Upstream or server error
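One way the `cost` object could be derived from token usage is sketched below. The field names follow the response shape above, but everything else is an assumption: the `Rates` type, the rule that image input tokens are billed like text input tokens, and the way `totalTokens` is summed are all hypothetical, and no real Gemini prices are implied.

```typescript
// Hypothetical per-unit rates, passed in by the caller.
// Real Gemini pricing must be looked up separately.
interface Rates {
  inputPerMillionTokens: number;  // USD per 1M input tokens
  outputPerMillionTokens: number; // USD per 1M output tokens
  perGeneratedImage: number;      // USD per generated image
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
  inputImageTokens: number;
  generatedImages: number;
}

function computeCost(usage: Usage, rates: Rates) {
  // Assumption: image input tokens are billed at the text input rate.
  const billedInput = usage.inputTokens + usage.inputImageTokens;
  const inputCost = (billedInput / 1e6) * rates.inputPerMillionTokens;
  const outputCost = (usage.outputTokens / 1e6) * rates.outputPerMillionTokens;
  const imageCost = usage.generatedImages * rates.perGeneratedImage;
  const totalCost = inputCost + outputCost + imageCost;
  return {
    ...usage,
    // Assumption: totalTokens counts text input, image input, and output.
    totalTokens: billedInput + usage.outputTokens,
    inputCost,
    outputCost,
    imageCost,
    totalCost,
    // Four decimal places, matching the "$0.0000" format in the response.
    formattedCost: `$${totalCost.toFixed(4)}`,
  };
}
```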
curl -s -X POST http://localhost:3000/api/generate-image \
  -F 'prompt=Make this photo look like a watercolor painting' \
  -F 'image=@/path/to/photo.png'
Enhance image quality using AI-powered face restoration via Replicate's CodeFormer.
- Content-Type: multipart/form-data
- Body fields:
  - image (file, required): Image to enhance
  - fidelity (string, optional): Enhancement fidelity (0.1-1.0, default: 0.7)
  - upscale (string, optional): Upscale factor (1-4, default: 2)
  - customApiKey (string, optional): Custom Replicate API key
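Because `fidelity` and `upscale` arrive as strings, a caller or route would likely want to parse and range-check them. The sketch below applies the documented defaults and ranges. `parseEnhanceOptions` is a hypothetical helper, and treating `upscale` as a whole number is an assumption not stated in the field description.

```typescript
interface EnhanceOptions {
  fidelity: number;
  upscale: number;
}

// Parse the string form fields, apply the documented defaults
// (fidelity 0.7, upscale 2), and reject out-of-range values.
function parseEnhanceOptions(fidelity?: string, upscale?: string): EnhanceOptions {
  const f = fidelity === undefined ? 0.7 : Number(fidelity);
  const u = upscale === undefined ? 2 : Number(upscale);
  if (!Number.isFinite(f) || f < 0.1 || f > 1.0) {
    throw new RangeError("fidelity must be between 0.1 and 1.0");
  }
  // Assumption: upscale is an integer factor.
  if (!Number.isInteger(u) || u < 1 || u > 4) {
    throw new RangeError("upscale must be an integer between 1 and 4");
  }
  return { fidelity: f, upscale: u };
}
```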
{
  "success": true,
  "enhancedImage": {
    "data": "<base64>",
    "mimeType": "image/png"
  },
  "originalImageUrl": "https://..."
}
- 400: Missing image or invalid API key
- 408: Enhancement timeout (> 5 minutes)
- 500: Upstream or server error
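On the client side, the response and status codes above might be handled along these lines. Both helpers are hypothetical names used for illustration; the messages simply restate the documented status meanings.

```typescript
interface EnhancedImage {
  data: string;     // base64-encoded image bytes
  mimeType: string; // e.g. "image/png"
}

// Map the documented status codes to user-facing messages.
function describeEnhanceStatus(status: number): string {
  switch (status) {
    case 400:
      return "Missing image or invalid API key";
    case 408:
      return "Enhancement timed out (over 5 minutes)";
    case 500:
      return "Upstream or server error";
    default:
      return status >= 200 && status < 300 ? "OK" : `Unexpected status ${status}`;
  }
}

// Turn the base64 payload into a data URL usable as an <img> src.
function toDataUrl(image: EnhancedImage): string {
  return `data:${image.mimeType};base64,${image.data}`;
}
```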
app/
  api/generate-image/route.ts        # Gemini streaming route, cost calculation
  api/enhance-image/route.ts         # Replicate CodeFormer enhancement
  components/ChatInterface.tsx       # Chat UI, uploads, cost modal
  components/EditorView.tsx          # Canvas, zoom, download
  components/ImagePreviewModal.tsx   # Image preview with enhancement
  components/SettingsModal.tsx       # API key management
  page.tsx                           # Layout wiring with Mantine AppShell
public/
  logo.svg
- pnpm dev: start dev server
- pnpm build: build for production
- pnpm start: start production server
- Keep GEMINI_API_KEY and REPLICATE_API_TOKEN on the server (.env.local); never expose them in client bundles.
- Requests to Gemini and Replicate are proxied via Next.js routes; clients never call these APIs directly.
- Custom API keys are stored locally in browser cookies and sent to server endpoints.
- One-click deploy on Vercel. Ensure GEMINI_API_KEY and, optionally, REPLICATE_API_TOKEN are set in the project environment variables.
Issues and pull requests are welcome. Please open an issue to discuss substantial changes.
- Next.js App Router
- Mantine UI
- Google Gemini API