Inspiration

We were inspired by wardrobe fatigue: the daily frustration of having a closet full of clothes but feeling like you have nothing to wear. We wanted to build something that could reason about context (weather, event, mood) and suggest the right outfit. A photorealistic try-on preview then reduces the uncertainty of picking an outfit by showing how it would actually look.


What it does

Gemini Wardrobe Style is an end-to-end AI wardrobe stylist that works in two stages:

  1. AI Styling Agent (Reasoning): The user provides a natural language prompt (e.g., "A warm, professional look for a rainy Tuesday"). The Gemini API analyzes this context, reasons over the descriptions of the pre-uploaded wardrobe items, and intelligently selects the best matching top, bottom, and outerwear.
  2. Virtual Try-On (Composition): The selected clothing images are fed back to the Gemini API along with the user’s photo. It generates a high-fidelity, photorealistic composite image of the user wearing the new outfit.
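
The two stages above can be sketched as one small orchestration function. This is a minimal sketch under stated assumptions, not the project's actual code: `Outfit`, `style_and_try_on`, and the callables `select_items`, `compose_image`, and `load_image` are hypothetical names. The two Gemini API calls are passed in as plain functions so the flow can be followed (and run) without network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Outfit:
    """Item ids chosen by the styling step (hypothetical structure)."""
    top: str
    bottom: str
    outerwear: str


def style_and_try_on(
    prompt: str,
    wardrobe: List[Dict[str, str]],
    user_photo: bytes,
    select_items: Callable[[str, List[Dict[str, str]]], Outfit],
    compose_image: Callable[[bytes, List[bytes]], bytes],
    load_image: Callable[[str], bytes],
) -> bytes:
    """Run stage 1 (reasoning) and feed its result into stage 2 (composition)."""
    # Stage 1: the model reasons over item descriptions and picks one per slot.
    outfit = select_items(prompt, wardrobe)

    # Guard against ids that are not in the wardrobe before spending an image call.
    known_ids = {item["id"] for item in wardrobe}
    chosen = [outfit.top, outfit.bottom, outfit.outerwear]
    unknown = [item_id for item_id in chosen if item_id not in known_ids]
    if unknown:
        raise ValueError(f"model selected unknown items: {unknown}")

    # Stage 2: the selected garment images plus the user's photo go to the image model.
    garment_images = [load_image(item_id) for item_id in chosen]
    return compose_image(user_photo, garment_images)
```

Keeping the id check between the two stages means a bad selection fails fast, before the slower image-generation call is made.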

How we built it

We implemented a pipeline using the Gemini API as the core intelligence engine:

  • We used Gemini 2.5 Pro for the Styling Agent (reasoning and item selection) due to its strong contextual understanding.
  • We used the multimodal capability of Gemini 2.5 Flash Image for image generation and composition, merging the clothing items onto the user's base photo.
  • The clothing metadata (color, type, material) was stored in a simple database or JSON file and supplied to the Gemini model through a well-structured prompt (a RAG-style pattern) to ground its decision-making.
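
As an illustration of the metadata-in-prompt pattern described in the last bullet, here is a minimal sketch. It assumes the JSON-file option and a JSON-only reply format. The records in `WARDROBE`, the prompt wording, and the helpers `build_styling_prompt` and `parse_selection` are all hypothetical; the project's real prompt is not shown in this write-up.

```python
import json
from typing import Dict, List

# Hypothetical wardrobe records using the metadata fields named above.
WARDROBE = [
    {"id": "top-01", "type": "top", "color": "navy", "material": "merino wool"},
    {"id": "bot-01", "type": "bottom", "color": "charcoal", "material": "wool blend"},
    {"id": "out-01", "type": "outerwear", "color": "beige", "material": "waxed cotton"},
]

SLOTS = ("top", "bottom", "outerwear")


def build_styling_prompt(request: str, wardrobe: List[Dict[str, str]]) -> str:
    """Inline the wardrobe metadata into the prompt so the model picks from listed items."""
    catalog = json.dumps(wardrobe, indent=2)
    return (
        "You are a wardrobe stylist. Choose one top, one bottom, and one outerwear "
        "item from the catalog below that best fit the request.\n"
        f"Request: {request}\n"
        f"Catalog:\n{catalog}\n"
        'Answer with JSON only, in the form '
        '{"top": "<id>", "bottom": "<id>", "outerwear": "<id>"}.'
    )


def parse_selection(reply: str, wardrobe: List[Dict[str, str]]) -> Dict[str, str]:
    """Parse the model's JSON reply and check each id exists and fills the right slot."""
    selection = json.loads(reply)
    if not isinstance(selection, dict):
        raise ValueError("reply must be a JSON object")
    type_by_id = {item["id"]: item["type"] for item in wardrobe}
    for slot in SLOTS:
        item_id = selection.get(slot)
        if type_by_id.get(item_id) != slot:
            raise ValueError(f"invalid {slot} selection: {item_id!r}")
    return {slot: selection[slot] for slot in SLOTS}
```

The prompt string would be sent as the text input of the styling call, and `parse_selection` would run on the reply before anything is passed to the image step.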

Challenges we ran into

  1. Prompt Engineering for Reasoning: The primary challenge was crafting a prompt that made Gemini consistently select the most appropriate clothing items for the context, rather than just matching keywords.
  2. Complex Data Flow and State Management: The front-end uploads the user photo and prompt, then waits while the back-end completes a multi-step process involving two API calls. Managing this asynchronous communication under time constraints required meticulous handling of loading and error states to keep the user experience smooth.
  3. Image Consistency: Achieving convincing photorealism, especially with clothing folds and shadows, required careful tuning of the generateContent parameters for the visual output.
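
The loading and error states from challenge 2 can be modeled as a small state tracker around the two calls. This is a sketch only: the `TryOnJob` class, its status names, and the 60-second default timeout are assumptions made for illustration, and the two coroutines stand in for the real API calls.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional


class TryOnJob:
    """Tracks which step of the two-call process is running (hypothetical helper)."""

    def __init__(self) -> None:
        self.status = "idle"
        self.history: List[str] = []
        self.error: Optional[str] = None
        self.result: Optional[bytes] = None

    def _set(self, status: str) -> None:
        # Record every transition so a front-end can render progress.
        self.status = status
        self.history.append(status)

    async def run(
        self,
        select_outfit: Callable[[], Awaitable[Any]],
        generate_image: Callable[[Any], Awaitable[bytes]],
        timeout_s: float = 60.0,
    ) -> None:
        try:
            self._set("styling")
            outfit = await asyncio.wait_for(select_outfit(), timeout_s)
            self._set("generating")
            self.result = await asyncio.wait_for(generate_image(outfit), timeout_s)
            self._set("done")
        except asyncio.TimeoutError:
            # self.status still names the step that was running when time ran out.
            self.error = f"{self.status} step timed out"
            self._set("error")
        except Exception as exc:
            self.error = f"{self.status} step failed: {exc}"
            self._set("error")
```

Because the error message is built before the status changes to "error", the user can be told which of the two steps failed instead of seeing a generic failure.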

Accomplishments that we're proud of

  • Multi-Modal Pipeline: Successfully combining complex reasoning (text) with high-fidelity image generation (visual) using the Gemini API.

What we learned

We gained a much deeper understanding of grounding with Gemini: how to structure data and prompts so that a Large Language Model (LLM) behaves as a reliable reasoning agent, making constrained, logical decisions before it executes a creative task (image generation).


What's next for AI Wardrobe Stylist

The immediate next step is to evolve the Styling Agent into a full E-Commerce Agent. The user would provide constraints such as an event, price, or brand. The agent would then use function calling to query live shopping APIs for new items and generate the try-on image, creating a "try before you buy" shopping experience.
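
To make the function-calling idea concrete, here is a rough sketch of the application side: the model is assumed to return a call as a name plus arguments, and the app dispatches it to a local function. Everything here is hypothetical, including the `search_products` tool, its parameters, and the in-memory `CATALOG` that stands in for a live shopping API.

```python
from typing import Any, Callable, Dict, List, Optional

# In-memory stand-in for a live shopping API (hypothetical data).
CATALOG = [
    {"sku": "A1", "name": "Linen blazer", "brand": "Acme", "price": 120.0},
    {"sku": "B2", "name": "Silk blazer", "brand": "Borealis", "price": 310.0},
    {"sku": "C3", "name": "Cotton blazer", "brand": "Acme", "price": 85.0},
]


def search_products(
    query: str,
    max_price: Optional[float] = None,
    brand: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical tool: filter by keyword, price ceiling, and brand, cheapest first."""
    matches = [
        product
        for product in CATALOG
        if query.lower() in product["name"].lower()
        and (max_price is None or product["price"] <= max_price)
        and (brand is None or product["brand"].lower() == brand.lower())
    ]
    return sorted(matches, key=lambda product: product["price"])


# Only functions registered here may be invoked on the model's behalf.
TOOLS: Dict[str, Callable[..., Any]] = {"search_products": search_products}


def dispatch_tool_call(call: Dict[str, Any]) -> Any:
    """Execute a model-requested call of the assumed shape {"name": ..., "args": {...}}."""
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("args", {}))
```

For example, a request like "a blazer under $150 from Acme" could lead the model to ask for `search_products` with `query="blazer"`, `max_price=150`, and `brand="Acme"`; the returned items would then feed the existing try-on step.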
