MATHEMAGICA
Inspiration
Students often encounter STEM topics—whether it’s fractions, algebraic expressions, or geometry proofs—in text-heavy formats that can feel intimidating or abstract. We all struggle to connect the dots between theory and real-world applications when concepts are presented solely through formulas and paragraphs. Classroom engagement drops off when students can’t see the “story” behind the math or science.
What if STEM lessons could feel more like reading a comic book than memorizing definitions?
By tapping into the natural appeal of illustrated narratives and storytelling, we believed we could turn dry equations into adventures, spark curiosity, and foster deeper understanding in a way that feels like play rather than work.
Our Solution
Our project is a web-based platform that transforms any STEM topic—ranging from simple arithmetic to complex algebra—into a six-panel comic adventure. Students or teachers input a concept or problem statement, and under the hood we use a two-stage AI pipeline:
- Grok LLM generates a structured, six-part storyline that personifies abstract ideas and embeds definitions, analogies, and cliffhangers to build curiosity.
- Grok then converts that narrative into detailed image prompts for each panel—complete with character actions, speech bubbles, onomatopoeia, and educational captions—which OpenAI’s image model renders into finished artwork.
The result is an auto-generated comic page where each panel blends vivid artwork and clear, readable text to introduce, explore, and conclude a STEM concept. By combining storytelling techniques with AI, our solution makes learning interactive, visually engaging, and emotionally resonant—so students remember the “why” behind the “how.”
How it works
Frontend Overview
The frontend presents a streamlined, visually engaging interface that guides users through comic creation in four clear stages: topic entry, progress indication, content reveal, and navigation & saving.
Site Navigation
- Navbar: Accessible links to Home, Comics, and Library maintain consistent navigation.
Home Tab

1. Topic Entry: Upon arrival, the user encounters a clean, bright comic-themed home screen featuring:
- Prominent Prompt Field: A single input box invites the user to enter any STEM topic.
- Create Button: Once text is entered, the user clicks Create to begin. This action immediately transitions the interface to the next stage.
2. Progress Indication: A panel opens showing the status of comic generation as a linear five-step timeline:
- Prompt Received

- Generating Story: As the story parts are generated, they are displayed with their titles.
- Generating Image Prompts
- Generating Images
- Saving to Database
Each step is represented by an icon that changes from a pending icon to a green check mark upon completion. This real‑time feedback assures the user of continual progress without technical jargon.
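The five steps can be modeled as an ordered list with a small helper that decides when a step flips to its green check. This is a sketch of one possible frontend representation; the names here are assumptions, not the project's actual code:

```typescript
// The five pipeline steps, in order (labels taken from the timeline above).
const STEPS = [
  "Prompt Received",
  "Generating Story",
  "Generating Image Prompts",
  "Generating Images",
  "Saving to Database",
] as const;

type Step = (typeof STEPS)[number];

// A step shows a green check once the pipeline has moved past it.
function isComplete(current: Step, step: Step): boolean {
  return STEPS.indexOf(step) < STEPS.indexOf(current);
}
```

The UI then only needs the current step from the backend to render all five icons correctly.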
3. Content Reveal: As the timeline advances, the user is presented with their custom comic content in sequence:
Chapter Title & Summary
When the narrative is ready, the comic’s chapter title appears prominently, followed by a concise summary in a styled caption box.
Six-Part Narrative
The story unfolds in six discrete sections—Part 1, Part 2, ... and Part 6—each introduced by its chapter title. As each section completes, it expands to display its narrative in clear, comic-style typography.
Panel-by-Panel Imagery
Immediately beneath each story section, the corresponding comic panel image fades into view. Placeholders maintain layout consistency until each illustration is available, ensuring a seamless reading experience.
4. Navigation & Saving: Once all six panels are complete, the generated comic can be viewed and read:
Book View
A View as Book toggle re-formats the content into a two-page spread. Each page displays its panel’s title, illustration, and narrative excerpt. Users navigate by:
- Clicking the left or right page edges to flip backward or forward.
- Using Previous and Next buttons for precise page turns.
Save Button
In the view, a persistent Save button enables the user to store their completed comic.
Comic Tab
- Showcases the “Mathemagica: The Comic Chronicles” series with its tagline and sample illustration.
Featured Sample Comic
- A representative comic panel, illustrating mathematical domains such as fractions and geometry in real-world scenes, appears alongside its title and page navigation.
- Under Comics, we feature a brief welcome description inviting visitors to dive into sample math comics.
Library Tab
Saved Comics Grid
The page opens with a responsive grid of Comic Cards, one for each story the user has generated and saved.
Card Details:
- Title of the chapter
- Creation Date (e.g. “May 25, 2025”)
- Thumbnail: a cropped preview of one panel or book-view spread
- Read Comic button
Reading Mode
Clicking Read Comic loads the selected story into the familiar two-page Book View:
- Maintains the same panel titles, full-width illustrations, and narrative text.
- Navigation controls (edge taps and Next/Previous buttons) function identically to the inline creation view.
By combining simple controls, a clear progress timeline, staged content reveals, and intuitive navigation—all within a cohesive comic-inspired design—the frontend delivers a professional yet engaging user experience.
Backend Overview
Our backend orchestrates a seamless pipeline that transforms a single STEM topic into a full six-panel comic.
Below is a detailed walkthrough using “Pythagorean Theorem” as our running example.
1. Receiving the Request
- User Action: The student enters “Pythagorean Theorem” and clicks Generate Comic.
- Backend: A POST request is made to /api/comic with a JSON payload { "prompt": "Pythagorean Theorem" }. The API responds only when the full comic (text + images) is ready, ensuring the frontend receives a complete package in one go.
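The request contract above can be sketched in TypeScript. The `buildComicRequest` helper is hypothetical (the real frontend may inline this), but the endpoint, method, and payload shape follow the description:

```typescript
// Sketch of the /api/comic request contract described above.
interface ComicRequest {
  prompt: string;
}

// Hypothetical helper: builds the fetch options for the comic endpoint.
function buildComicRequest(prompt: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt } as ComicRequest),
  };
}

// Frontend usage: the promise resolves only once text + images are ready.
// const res = await fetch("/api/comic", buildComicRequest("Pythagorean Theorem"));
// const comic = await res.json();
```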
2. Generating the Story
- Input: The raw topic string.
- Process:
- The backend constructs a system prompt that instructs Grok LLM to produce a structured, six-part narrative in strict JSON form.
- The student’s topic is sent alongside this prompt to Grok via openai.chat.completions.create({ model: "grok-3", … }).
- Grok returns a JSON object containing an overall_chapter_name, a story_summary, and a parts array with six entries, each holding a chapter_title and story_content.
- Output: The backend parses and validates this JSON, then stores it in Supabase’s stories table under the current session.
3. Creating Image Prompts
- Input: The concatenated text of all six story parts.
- Process:
- Using a second, detailed image-prompt system prompt, the backend asks Grok to generate six image-prompt objects. Each object specifies panel layout, character positions, speech bubbles, sound effects, and educational captions.
- The call to Grok returns a JSON blob with an image_prompts array of six items.
- Output: This array is parsed, validated against our schema, and then saved in Supabase’s image_prompts table, linked to the story.
4. Rendering Comic Panels
- Input: Each of the six image-prompt objects.
- Process:
- For each prompt, the backend calls openai.images.generate({ model: "gpt-image-1", prompt }).
- If the response contains a direct base64 payload, it’s used immediately. If it returns a URL (DALL·E 3), the backend fetches the image and converts it to base64.
- Each base64 string is saved in Supabase’s images table, along with metadata: panel_number, prompt_id, and an optional file path if persisted to disk.
- Output: A collection of six base64-encoded PNGs representing each comic panel.
5. Final Response Payload
Once all steps complete, the /api/comic endpoint returns a single JSON response combining:
- chapter_name: the overall title.
- story_summary: the brief 3–4 line summary.
- parts: the six narrative parts.
- images: an array of six objects, each with panel_number and imageBase64.
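Put together, the response can be modeled as a TypeScript type with a simple runtime guard. This is a sketch inferred from the field list above, not the project's actual type definitions:

```typescript
// Assumed shape of the consolidated /api/comic response.
interface ComicResponse {
  chapter_name: string;
  story_summary: string;
  parts: { part_number: number; chapter_title: string; story_content: string }[];
  images: { panel_number: number; imageBase64: string }[];
}

// Runtime guard: both arrays must hold exactly six entries.
function isComicResponse(x: any): x is ComicResponse {
  return (
    typeof x?.chapter_name === "string" &&
    typeof x?.story_summary === "string" &&
    Array.isArray(x?.parts) && x.parts.length === 6 &&
    Array.isArray(x?.images) && x.images.length === 6
  );
}
```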
6. Data Persistence with Supabase
To support history, edits, and sharing, we persist every element:
- sessions: records user_id, topic, and timestamp.
- stories: stores the full JSON narrative.
- image_prompts: keeps the JSON instructions for each panel.
- images: holds base64 data or public URLs of all rendered panels.
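A minimal persistence sketch, assuming the table and column names above (the helper and exact schema are illustrative, not taken from the codebase):

```typescript
// Hypothetical row shape for the sessions table described above.
interface SessionRow {
  user_id: string;
  topic: string;
  created_at: string;
}

// Pure helper: the timestamp is injected so the row-building step is testable.
function toSessionRow(userId: string, topic: string, now: Date): SessionRow {
  return { user_id: userId, topic, created_at: now.toISOString() };
}

// With @supabase/supabase-js, the insert would look roughly like:
// await supabase.from("sessions").insert(toSessionRow(userId, topic, new Date()));
```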
By processing story, prompts, and images in sequence—then returning a consolidated response—our backend delivers a turnkey comic generation experience. The frontend simply renders the text and embeds the base64 images, giving learners an engaging STEM adventure without any manual steps.
How we built it
1. Project & Dependency Setup
- Framework: Next.js (API Routes) for serverless functions
- SDKs:
- openai JavaScript client to interact with both Grok LLM (grok-3) and OpenAI’s Image API (gpt-image-1 / DALL·E 3)
- @supabase/supabase-js for database interaction
- File Structure:
my-app/
├── app/                      # Next.js app directory
│   ├── page.tsx              # Home page
│   ├── book/                 # Interactive Math Comic Book page
│   │   └── page.tsx          # Book page
│   ├── book-progress/        # Real-time comic generation
│   │   └── page.tsx          # Comic generation progress page
│   ├── explore/              # Comic library and exploration
│   │   └── page.tsx          # Explore comics page
│   ├── test-image-page/      # Image generation testing
│   │   └── page.tsx          # Test image generation
│   └── api/                  # API routes
│       ├── comic/            # Comic story generation API (Grok)
│       │   └── route.ts
│       └── test-image/       # Image generation API (OpenAI)
│           └── route.ts
├── components/               # React components
│   ├── blocks/               # Layout blocks (hero-section, story-section)
│   ├── ui/                   # UI components (button, badge, book, glow, mockup)
│   │   └── book.tsx          # Interactive Book component
│   ├── internal/             # Internal components (page-shell for navbar management)
│   └── navbar.tsx            # Site navigation with comic-book styling
└── lib/                      # Utility functions
    └── utils.ts              # Common utilities
2. Orchestrating the Comic Pipeline
Our /api/comic handler is the heart of the system. When a request arrives:
- Receive Topic
- We parse { prompt: string } from the POST body. If it’s missing, we immediately return a 400 error.
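The body check can be sketched as a small pure function. `parseTopic` is a hypothetical helper; the real handler inlines this logic and returns an HTTP 400 response on failure:

```typescript
// Validate the POST body for /api/comic (sketch; names assumed).
function parseTopic(body: unknown): { topic?: string; error?: string } {
  const prompt = (body as { prompt?: unknown } | null)?.prompt;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return { error: "Missing prompt" }; // handler responds with HTTP 400 here
  }
  return { topic: prompt.trim() };
}
```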
Generate Story
We send the user’s topic to Grok LLM with our storySystemPrompt. That prompt defines a rigid JSON schema for a six-part narrative, complete with chapter titles, cliffhangers, and embedded STEM explanations.
Example snippet:
const storyResponse = await openai.chat.completions.create({
  model: "grok-3",
  messages: [
    { role: "system", content: storySystemPrompt },
    { role: "user", content: `STEM Topic: ${topic}` },
  ],
  response_format: { type: "json_object" },
});
We immediately parse and validate the JSON, ensuring all required fields are present. Any mismatch throws a descriptive error.
Save Story to Database
- Using our Supabase helper saveStory(topic, storyJson), we insert the entire JSON blob into a stories table. This enables history, revision, and sharing.
Generate Image Prompts
- We concatenate the six story parts into plain text and pass them to Grok again, this time with imagePromptSystemPrompt, which demands six fully detailed image-prompt objects.
- Each object specifies panel layout, speech bubbles, onomatopoeia, captions, and art-style notes.
- After parsing and validating, we store these in an image_prompts table via savePrompts(chapterName, promptsArray).
Render Panels
For each prompt in promptsArray, we call our /api/image helper function:
const { imageBase64 } = await fetch("/api/image", {
  method: "POST",
  body: JSON.stringify({ prompt: panel.promptText }),
}).then(res => res.json());
Inside /api/image, we use openai.images.generate to produce a base64 PNG (or fetch and convert a DALL·E URL).
Save & Return Images
- Each imageBase64 result is saved into an images table (linked by prompt ID), and we accumulate the results into the final response payload.
3. Strict JSON Schemas & Validation
To ensure reliability, we defined two JSON schemas in code:
Story Schema:
{
"overall_chapter_name": "string",
"story_summary": "string",
"parts": [ { "part_number": 1, "chapter_title": "string", "story_content": "string" } x6 ]
}
Image Prompt Schema: Includes id, title, panel_layout_description, and a panels array of exactly six items, each requiring panel_number, description, and dialogue_caption.
In TypeScript, right after JSON.parse, we check:
if (
typeof storyJson.overall_chapter_name !== "string" ||
!Array.isArray(storyJson.parts) ||
storyJson.parts.length !== 6
) {
throw new Error("Invalid story JSON structure");
}
This prevents malformed AI outputs from cascading further into the pipeline.
4. Rendering Images with OpenAI
Our /api/image route is intentionally minimal:
- Receive prompt string.
- Call
const result = await openai.images.generate({
model: "gpt-image-1",
prompt,
size: "1024x1536",
quality: "medium",
});
- Extract the image data:
  - If result.data[0].b64_json exists, use it directly.
  - Otherwise, fetch the URL and convert to base64 with Buffer.from(...).toString("base64").
- Return { imageBase64: "<base64-data>" } to the caller.
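The two branches above can be sketched as follows. The pure extraction step is shown; the URL fetch-and-convert branch is left as a comment since it requires network access (types and names here are illustrative):

```typescript
// One entry from the image API response (fields as described above).
interface ImageDatum {
  b64_json?: string;
  url?: string;
}

function extractBase64(datum: ImageDatum): string | null {
  // gpt-image-1 returns the base64 payload directly.
  if (datum.b64_json) return datum.b64_json;
  // DALL·E 3 returns a URL instead; the caller fetches and converts:
  // const buf = await (await fetch(datum.url!)).arrayBuffer();
  // return Buffer.from(buf).toString("base64");
  return null;
}
```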
This separation lets us reuse the same image-rendering logic whether it’s invoked from our main /api/comic route or any other part of the app.
5. Persisting Data in Supabase
- Books: After generating the six-panel comic book, the book id, title, story content, images, and other details are stored in the books table, which is later used to load data into the library.
- Book Images: As each prompt is rendered into an actual comic panel, we insert a record into the book images table. Each record references the matching prompt record’s ID, contains the base64 PNG data (or, optionally, a file path), and logs when it was created.

This architecture ensures a maintainable, extensible, and scalable backend—ready for further enhancements like user authentication, analytics, or new AI-driven features.
Challenges We Faced
- Balancing Creativity and Accuracy: Crafting prompts that produced engaging, story-driven comics while ensuring all STEM facts were correct required extensive iteration and testing.
- Ensuring Consistent Comic Style: Early image outputs varied in layout, color, and text placement. We refined our prompt guidelines to enforce uniform panel dimensions, lettering styles, and speech-bubble rules.
- Coordinating Multiple API Calls: Integrating the narrative LLM, image-prompt LLM, rendering API, and database storage into a single seamless workflow demanded careful sequencing, error handling, and clear frontend feedback.
- Vercel Timeout Constraints: On the free Vercel tier, functions cannot run longer than 59 seconds, and generating all six image prompts in one call exceeded this limit. We resolved this by splitting the image-prompt endpoint into two sequential calls, each generating three prompts, keeping execution time under the limit.
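The timeout workaround amounts to batching: split the six prompts into two groups of three and request them in sequence. A sketch (the endpoint name and exact batching code are assumptions):

```typescript
// Split an array into consecutive batches of the given size.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Two sequential requests, each well under the 59-second limit:
// const [first, second] = chunk(panelPrompts, 3);
// await fetch("/api/image-prompts", { method: "POST", body: JSON.stringify(first) });
// await fetch("/api/image-prompts", { method: "POST", body: JSON.stringify(second) });
```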
What's next for Mathemagica
Looking ahead, we plan to enhance Mathemagica with features that deepen engagement and broaden its reach:
- Embedded Dialogue in Artwork: Right now, images and text render separately. Our next step is to integrate speech bubbles and captions directly into each panel image—making the comic feel more authentic and immersive.
- User Accounts & Personalization: Add login so learners can save favorite chapters, track progress, and receive AI-driven story recommendations tailored to their interests.
- Interactive Exercises: Embed quick quizzes and drag-and-drop activities into panels, transforming passive reading into active problem-solving.
- Performance & Scalability: Move heavy tasks like image-prompt generation to the cloud to cut load times and support more concurrent users.
Why Mathemagica Stands Out
Mathemagica distinguishes itself through a blend of AI-driven innovation and learner-centric design:
- Effortless Comic Creation: A single prompt delivers a six-part narrative and full-color panels—no artistic skill or configuration needed.
- Transparent, Engaging Workflow: A live, five-step timeline (Prompt → Story → Prompts → Panels → Save) keeps users informed and turns wait-time into part of the experience.
- Deep, Memorable Storytelling: Concepts are personified as heroes and villains across cliffhanger-driven chapters, reinforcing understanding through a narrative arc.
- Flexible, Intuitive Reader: Switch seamlessly between scrollable panels and a two-page spread, with click-or-tap navigation that mirrors physical page-turning.
- Persistent Library & Inspiration: Every comic is saved with metadata for easy retrieval, while curated samples across STEM fields spark further exploration.
Together, these features make Mathemagica not just a tool, but a dynamic environment where STEM learning becomes visual, interactive, and inherently memorable.
Built With
- chatgpt
- dalle
- grok
- node.js
- supabase