Inspiration
When we heard about the superhero theme, we weren't sure where to start. Then we got to chatting about comic books while sitting in an AI lab, and the idea hit us: why not try to generate comic books using AI?
What it does
ComicStrip is a web application that leverages several AI applications and APIs to generate a full set of comic frames from a single prompt.
How we built it
First, ComicStrip sends the user's prompt to ChatGPT to get a fleshed-out story and a set of characters, which are then developed into frame descriptions. Each description is sent to a fine-tuned diffusion model to generate an image. We then use CLIPSeg to locate the characters in each frame based on their physical descriptions and attach text bubbles with their dialogue. Once several frames are compiled, the set is sent to a React website where the user can view the results.
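The flow described here can be sketched as a small pipeline with one pluggable function per stage. This is only an illustration, not our actual code: the `Frame` fields and the names `write_frames`, `draw_image`, and `locate` are hypothetical stand-ins for the ChatGPT, diffusion, and CLIPSeg calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Frame:
    description: str  # text sent to the image model
    speaker: str      # character who talks in this frame
    dialogue: str     # text for the speech bubble
    image: Optional[Any] = None                   # set by the image stage
    speaker_xy: Optional[Tuple[int, int]] = None  # set by the locator stage


def build_comic(
    prompt: str,
    write_frames: Callable[[str], List[Frame]],         # stand-in for the ChatGPT step
    draw_image: Callable[[str], Any],                   # stand-in for the diffusion model
    locate: Callable[[Any, str], Tuple[int, int]],      # stand-in for the CLIPSeg step
) -> List[Frame]:
    """Run the stages in order and return the finished frames."""
    frames = write_frames(prompt)
    for frame in frames:
        frame.image = draw_image(frame.description)
        frame.speaker_xy = locate(frame.image, frame.speaker)
    return frames
```

Keeping each stage behind a plain function makes it easy to swap in a fake while the real model or API is unavailable.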
Challenges we ran into
- Providing strict prompts to ChatGPT
- Locating characters in the images
- Keeping characters consistent across frames
- Adding React elements
- Managing API access
- Keeping text bubbles from covering important parts of the image
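One way to approach the last challenge is to treat the segmentation output as a heatmap and put the bubble where it covers the least "character". The sketch below is an assumption about how that could work, not the method we shipped: `heatmap` is a hypothetical 2D list of scores (higher means more likely a character), and sizes are in heatmap cells.

```python
def best_bubble_position(heatmap, bubble_w, bubble_h):
    """Return (x, y) of the top-left corner of the bubble_w x bubble_h
    window that overlaps the least total heatmap score."""
    rows, cols = len(heatmap), len(heatmap[0])
    if bubble_h > rows or bubble_w > cols:
        raise ValueError("bubble is larger than the frame")

    # Summed-area table: sat[r][c] holds the sum of heatmap[0:r][0:c].
    sat = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (
                heatmap[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c]
            )

    best, best_cost = None, float("inf")
    for y in range(rows - bubble_h + 1):
        for x in range(cols - bubble_w + 1):
            # Sum of the window in constant time via the table.
            cost = (
                sat[y + bubble_h][x + bubble_w]
                - sat[y][x + bubble_w]
                - sat[y + bubble_h][x]
                + sat[y][x]
            )
            if cost < best_cost:  # ties keep the first (top-left-most) window
                best, best_cost = (x, y), cost
    return best
```

A real version would probably also keep the bubble close to its speaker, which this sketch ignores.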
Accomplishments that we're proud of
- Laying out a modular ChatGPT prompting system
- Using segmentation models to locate characters
- Fine-tuning a diffusion model for comic book styling
- Rendering nice-looking text bubbles in Python
- Hosting our webpage
- Deploying a custom Hugging Face server to host a diffusion model
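To give a flavor of what a modular, strict prompting setup can look like, here is a minimal sketch. The section text, the JSON keys, and the function names are all hypothetical (our real prompts are not reproduced here), and the call to the chat model itself is left out.

```python
import json

# Hypothetical keys each frame must carry.
REQUIRED_KEYS = ("description", "speaker", "dialogue")

# Hypothetical reusable prompt sections, combined per request.
SECTIONS = {
    "role": "You are a comic book writer.",
    "format": (
        "Reply with a JSON list only, no extra text. Each item must have "
        "the keys: " + ", ".join(REQUIRED_KEYS) + "."
    ),
}


def build_prompt(user_prompt, n_frames, section_names=("role", "format")):
    """Assemble a prompt from named sections plus the user's idea."""
    parts = [SECTIONS[name] for name in section_names]
    parts.append(f"Write exactly {n_frames} frames for this idea: {user_prompt}")
    return "\n\n".join(parts)


def parse_frames(reply, n_frames):
    """Validate the model's reply; raise ValueError if it breaks the format."""
    frames = json.loads(reply)  # JSONDecodeError is a ValueError subclass
    if not isinstance(frames, list) or len(frames) != n_frames:
        raise ValueError(f"expected a list of {n_frames} frames")
    for i, frame in enumerate(frames):
        if not isinstance(frame, dict):
            raise ValueError(f"frame {i} is not an object")
        missing = [k for k in REQUIRED_KEYS if k not in frame]
        if missing:
            raise ValueError(f"frame {i} is missing keys: {missing}")
    return frames
```

Rejecting a malformed reply early means the request can simply be retried before any image generation is spent on it.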
What we learned
- How several AI APIs work, including those from OpenAI and Hugging Face
- Many of the foundational principles of prompt engineering
- Developing a tool with distinct back and front ends
- Building websites with React
What's next for ComicStrip
- Larger comics
- User control on how to continue the story
- Higher-fidelity images with fewer diffusion artifacts
Built With
- cloudflare
- huggingface
- javascript
- openai
- python
- react