Inspiration

In today's digital world, traditional educational methods struggle to engage children, leading to poor learning outcomes. While static textbooks often fail to captivate tech-savvy learners, educational comics have shown promise in making complex subjects more approachable and engaging. Our inspiration came from "The Manga Guide to Calculus," which transformed a challenging topic into an accessible and enjoyable learning experience. This demonstrated the power of visual storytelling in education. We realized that by animating educational comics, we could further revolutionize learning, bridging the gap between static content and the interactive experiences modern students crave, ultimately improving engagement, retention, and academic performance.

Making education more engaging is a persistent struggle for educators:

According to a study by Microsoft, the average attention span of humans has dropped from 12 seconds in 2000 to just 8 seconds in 2015, highlighting the need for more engaging educational content. [Source: https://time.com/3858309/attention-spans-goldfish/]

A report by Common Sense Media found that teenagers spend an average of 7 hours and 22 minutes per day on screen media for entertainment, not including time spent on screens for school or homework. [Source: https://www.commonsensemedia.org/research/the-common-sense-census-media-use-by-tweens-and-teens-2019]

Studies indicate that interactive learning methods can increase retention rates by 25% to 60% compared to traditional lecture-based learning. [Source: https://www.shiftelearning.com/blog/bid/301248/15-facts-and-stats-that-reveal-the-power-of-elearning]

What it does

ComicMotion AI transforms static black and white comics into dynamic, colorful animations. It analyzes the comic page, identifies characters and dialogue, colorizes the artwork, segments panels, generates AI voices for characters, and combines everything into an engaging animated video. This tool breathes life into educational comics created by educators, making them more appealing and effective for young learners.
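The end-to-end flow described above can be sketched as a simple sequential pipeline. This is a hypothetical sketch, not the project's actual code: every helper below is a placeholder stub standing in for the real components (a YOLOv9 detector, a GAN colorizer, OCR plus the "magi" character identifier, and Replica AI for voices).

```python
# Hypothetical sketch of the ComicMotion AI pipeline. Each helper is an
# illustrative stub; the real project wires in YOLOv9, a GAN colorizer,
# OCR + "magi", Claude on AWS Bedrock, and Replica AI at these points.

def segment_panels(page):              # stand-in for YOLOv9 panel detection
    return [{"id": i, "image": crop} for i, crop in enumerate(page)]

def colorize(panel):                   # stand-in for the GAN colorizer
    return {**panel, "colorized": True}

def extract_dialogue(panel):           # stand-in for OCR + character ID
    return [("narrator", f"panel {panel['id']} text")]

def synthesize_voice(speaker, text):   # stand-in for voice generation
    return f"audio://{speaker}/{abs(hash(text))}"

def animate_comic(page):
    """Run the full pipeline for one page and return render-ready data."""
    panels = [colorize(p) for p in segment_panels(page)]
    script = [
        {"panel": p["id"], "speaker": speaker, "text": text,
         "audio": synthesize_voice(speaker, text)}
        for p in panels
        for speaker, text in extract_dialogue(p)
    ]
    # In the real system this output is handed to Remotion for rendering.
    return {"panels": panels, "script": script}
```

The key design point is that each stage consumes the previous stage's output, so any single component (say, the colorizer) can be swapped without touching the rest of the pipeline.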

How we built it

We developed ComicMotion AI using a combination of cutting-edge technologies:

  • Frontend: Next.js and Tailwind CSS for a responsive, modern web interface
  • Semantic Analysis: the open-source "magi" model for character and dialogue identification
  • Colorization: a GAN (generative adversarial network) for intelligent colorization
  • Panel Segmentation: a YOLOv9 object-detection model
  • Text Extraction: OCR (optical character recognition) for pulling dialogue text from speech bubbles
  • Dialogue Processing: Claude 3.5 via AWS Bedrock for character description generation
  • Voice Synthesis: Replica AI for voice generation
  • Animation Engine: Remotion, a browser-based animation tool
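One small but essential step downstream of panel segmentation is putting the detected panels into reading order before animating them. A minimal sketch, assuming axis-aligned `(x, y, w, h)` boxes as an object detector might emit and Western top-to-bottom, left-to-right reading order; the `row_tolerance` parameter is an assumption for illustration, not a value from the project:

```python
def reading_order(boxes, row_tolerance=0.5):
    """Sort panel bounding boxes top-to-bottom, then left-to-right.

    boxes: list of (x, y, w, h) tuples. Panels whose vertical centers
    lie within `row_tolerance` of a panel height are grouped into one
    row. Illustrative sketch only, not the project's actual logic.
    """
    if not boxes:
        return []
    # Sort by vertical center first.
    ordered = sorted(boxes, key=lambda b: b[1] + b[3] / 2)
    rows, current = [], [ordered[0]]
    for box in ordered[1:]:
        prev = current[-1]
        # Same row if vertical centers are close relative to panel height.
        if abs((box[1] + box[3] / 2) - (prev[1] + prev[3] / 2)) < row_tolerance * prev[3]:
            current.append(box)
        else:
            rows.append(current)
            current = [box]
    rows.append(current)
    # Within each row, read left to right.
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]
```

Grouping by rows first matters because detectors return boxes in confidence order, and panels in the same row rarely share an exact y-coordinate.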

Challenges we ran into

One of our main challenges was dealing with hallucinations in our language models. The AI sometimes generated inaccurate or inconsistent character descriptions, which affected the overall coherence of the animations. Additionally, we struggled with the quality of AI-generated voices, which sometimes lacked the natural intonation and emotion required for engaging storytelling.

Accomplishments that we're proud of

We're proud of creating a fully automated system that can transform static comics into animations with minimal human intervention. Our application's ability to identify and track characters across panels is particularly impressive. We're also pleased with our GAN-based colorization, which adds vibrancy to the original artwork while maintaining its integrity.

What we learned

This project taught us the intricacies of integrating multiple AI models into a cohesive system. We gained valuable insights into the challenges of natural language processing, computer vision, and voice synthesis. We also came to better understand the importance of fine-tuning models to reduce hallucinations and improve output quality, something we plan to pursue in the future.

What's next for toona

Looking ahead, we plan to:

  • Refine our language models to reduce hallucinations and improve character consistency
  • Enhance our voice synthesis to produce more natural and emotive speech
  • Expand our platform to support multiple languages for global accessibility
  • Develop tools for educators to easily customize and create their own animated educational content
  • Explore integration with AR/VR technologies for immersive learning experiences
