Inspiration

Google's Teachable Machine proved the pipeline in 2017: browser-based, no-code, train-on-your-webcam. We asked a different question: what would it take for a seven-year-old to do this alone — and want to come back tomorrow?

Three things, we think: no camera required (plenty of kids, parents, and classrooms won't turn one on — so your crayon drawings become the dataset), a pipeline you can see (every step is a named block you snap into place), and joy (a robot coach, sounds, confetti — the moment it learns should feel like magic you made).

What it does

ScratchML is a studio where kids train a real image classifier with zero code:

  • 🧩 Snap a recipe: Use the Camera (or Use the Sketchpad) → Teach a Thing ×2+ → Train the Brain → Guess It!
  • ✏️ No camera needed — a crayon sketchpad (7 colors, brush sizes, eraser, undo) makes drawings the training data — the wedge Teachable Machine never shipped
  • 🧠 Train in seconds — MobileNet transfer learning + a trainable head, entirely in the browser
  • 🎉 Guess live — confidence bars, a reactive robot mascot, confetti on the first confident guess
  • 🗑 Curate like a practitioner — inspect and delete individual training examples
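
To make "transfer learning + a trainable head" concrete, here is a minimal sketch of a softmax head trained on top of frozen embeddings. It is an illustration under assumptions, not the project's code: the three-number vectors are hand-written stand-ins for what a frozen MobileNet would produce, and the `SoftmaxHead` class, its method names, and the learning rate are all hypothetical.

```typescript
// Toy "frozen features + trainable head" sketch.
// Assumption: embeddings are hand-written stand-ins for MobileNet outputs.
type Example = { embedding: number[]; label: number };

function softmax(logits: number[]): number[] {
  const max = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

class SoftmaxHead {
  private weights: number[][]; // shape: [numClasses][dim]
  private biases: number[];
  private numClasses: number;
  private dim: number;

  constructor(numClasses: number, dim: number) {
    this.numClasses = numClasses;
    this.dim = dim;
    this.weights = Array.from({ length: numClasses }, () =>
      new Array<number>(dim).fill(0),
    );
    this.biases = new Array<number>(numClasses).fill(0);
  }

  // Returns one probability per class.
  predict(embedding: number[]): number[] {
    const logits = this.weights.map((row, c) =>
      row.reduce((acc, w, i) => acc + w * embedding[i], this.biases[c]),
    );
    return softmax(logits);
  }

  // Per-example gradient descent on cross-entropy loss.
  // Only the head is updated; the embeddings are never modified.
  train(examples: Example[], epochs = 200, learningRate = 0.5): void {
    for (let epoch = 0; epoch < epochs; epoch++) {
      for (const { embedding, label } of examples) {
        const probs = this.predict(embedding);
        for (let c = 0; c < this.numClasses; c++) {
          const grad = probs[c] - (c === label ? 1 : 0);
          for (let i = 0; i < this.dim; i++) {
            this.weights[c][i] -= learningRate * grad * embedding[i];
          }
          this.biases[c] -= learningRate * grad;
        }
      }
    }
  }
}

// Two classes (0 = circle, 1 = square), two examples each.
const head = new SoftmaxHead(2, 3);
head.train([
  { embedding: [1.0, 0.0, 0.2], label: 0 },
  { embedding: [0.9, 0.1, 0.3], label: 0 },
  { embedding: [0.0, 1.0, 0.1], label: 1 },
  { embedding: [0.1, 0.9, 0.2], label: 1 },
]);
const probs = head.predict([0.9, 0.1, 0.2]); // a new, circle-like embedding
console.log(probs);
```

Because only the small head is trained, a handful of examples and a few hundred cheap updates are enough, which is consistent with training taking seconds in a browser.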

Honest note: today the blocks form one guided recipe, not arbitrary programs — training wheels that teach the vocabulary of every ML system (input → examples → train → predict). Real composability is the roadmap, below.
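
One way to picture "one guided recipe" is a checker that accepts exactly the input → examples → train → predict order. This is a hypothetical sketch: the `RecipeBlock` names and the `checkRecipe` function are invented for illustration and are not taken from the ScratchML source.

```typescript
// Hypothetical block kinds mirroring the recipe described above.
type RecipeBlock = "camera" | "sketchpad" | "teach" | "train" | "guess";

type RecipeCheck = { ok: true } | { ok: false; reason: string };

function checkRecipe(blocks: RecipeBlock[]): RecipeCheck {
  const [first, ...rest] = blocks;
  if (first !== "camera" && first !== "sketchpad") {
    return { ok: false, reason: "Start with an input block." };
  }

  // Count the run of "teach" blocks that follows the input block.
  const firstNonTeach = rest.findIndex((b) => b !== "teach");
  const teachCount = firstNonTeach === -1 ? rest.length : firstNonTeach;
  if (teachCount < 2) {
    return { ok: false, reason: "Teach at least two things." };
  }

  // Whatever remains must be exactly: train, then guess.
  const tail = rest.slice(teachCount);
  if (tail.length !== 2 || tail[0] !== "train" || tail[1] !== "guess") {
    return { ok: false, reason: "Finish with Train the Brain, then Guess It!" };
  }
  return { ok: true };
}

console.log(checkRecipe(["sketchpad", "teach", "teach", "train", "guess"]));
console.log(checkRecipe(["camera", "teach", "train", "guess"]));
```

A fixed grammar like this is what separates a recipe from a language: there is one valid shape, and the checker's job is to say which step is missing.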

Privacy is structural: camera frames and drawings never leave the device. There is no backend.

How we built it

  • ML: TensorFlow.js + MobileNet v2 (self-hosted weights); hardware WebGL with a WASM-SIMD fallback for machines without it — kids' products get opened on old school laptops
  • Studio: Next.js 16, TypeScript, Tailwind v4, dnd-kit, zustand; custom pointer-events sketchpad with coalesced input
  • Proof: a Playwright suite that draws shapes with the mouse, trains, and asserts the model recognizes a new drawing — run against the live production URL
  • Shipping: Docker on Fly.io; Novus by Pendo measuring from day one; the model pre-loads while visitors read the landing page
  • Tooling: built with Claude Code as the development pair, ElevenLabs for sound, OpenAI image models for art — the human owned every product decision, scope cut, and live-site bug report
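
The backend fallback in the ML bullet can be sketched as "try each backend in order, keep the first that initializes". The `pickBackend` helper and the `ActivateBackend` type are hypothetical names. The activation function is injected so the sketch runs without dependencies; with TensorFlow.js it could plausibly be `(name) => tf.setBackend(name)`, which resolves to a boolean, with the WASM backend package registered beforehand.

```typescript
// Resolves to true when the named backend initialized successfully.
type ActivateBackend = (name: string) => Promise<boolean>;

// Tries each backend in order and returns the first one that works.
async function pickBackend(
  preferred: string[],
  activate: ActivateBackend,
): Promise<string> {
  for (const name of preferred) {
    try {
      if (await activate(name)) return name;
    } catch {
      // Treat a thrown error like a refused backend and keep going.
    }
  }
  throw new Error(`No usable backend among: ${preferred.join(", ")}`);
}

// Fake activator simulating a machine where WebGL is unavailable.
const noWebgl: ActivateBackend = async (name) => name !== "webgl";

pickBackend(["webgl", "wasm", "cpu"], noWebgl).then((name) => {
  console.log(name); // logs "wasm"
});
```

This sketch only covers a backend that refuses to start. Detecting software-rendered WebGL, which initializes but runs slowly, would need a separate check that is not shown here.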

Challenges we ran into

  • The self-erasing sketchpad — "my drag doesn't register" was a React effect with an unstable dependency repainting the canvas white whenever an unrelated toast dismissed. Two-line fix, found by measuring instead of guessing.
  • The invisible GO block — Tailwind v4 tree-shakes tokens it can't see in class names; our runtime-composed block colors got deleted from the CSS and the most important button rendered white-on-transparent.
  • The 17-second brain — without hardware WebGL, shader compilation took 17s and the CPU fallback froze the page. The WASM backend cut inference ~100×; the e2e suite went from 4.5 minutes to 49 seconds.
  • The model that learned the "wrong" thing — circles drawn in ink, squares in red crayon: it classified a red circle as a square. Not a bug — color really did separate the examples. It's now a coaching moment in the product.
  • A bug only real use found — picking a demo after the model loaded stranded the GO button on "loading brain…" forever. The user's exact flow became a regression test: verified failing on production, then fixed.
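
The invisible GO block is an instance of a general Tailwind pitfall: the build scans source files as plain text, so a name assembled at runtime never appears in the scan. A common workaround is to keep every class name as a complete literal, sketched below. The tone names and color choices are hypothetical, and this is not necessarily the fix the project shipped.

```typescript
// Hypothetical block tones; the real palette is not given in the write-up.
type BlockTone = "input" | "teach" | "train" | "go";

// Fragile: the final class names exist only at runtime, so a text scan of
// the source never finds "bg-sky-500" and the style may not be generated.
function fragileBlockClass(color: string): string {
  return `bg-${color}-500 text-${color}-50`;
}

// Safer: every class name appears in the source as a complete literal.
const BLOCK_CLASSES: Record<BlockTone, string> = {
  input: "bg-sky-500 text-white",
  teach: "bg-amber-400 text-slate-900",
  train: "bg-violet-500 text-white",
  go: "bg-emerald-500 text-white",
};

function blockClass(tone: BlockTone): string {
  return BLOCK_CLASSES[tone];
}

console.log(fragileBlockClass("sky")); // composed at runtime
console.log(blockClass("go")); // looked up from literals
```

The lookup table also gives the type checker something to enforce: adding a new tone without a class entry fails to compile.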

Accomplishments that we're proud of

  • A stranger can open a URL and train a neural network in about two minutes — on a phone, no account, no camera required
  • The e2e suite tests actual learning, not just UI
  • The privacy claim survives a network-tab audit
  • WCAG AA contrast, focus-trapped dialogs, and keyboard-operable capture — on candy-colored blocks

What we learned

  • Name your prior art. Early drafts implied more novelty than we'd earned; "Teachable Machine, rebuilt as a toy" is truer and stronger.
  • Variety is the curriculum. A model learns whatever separates your examples — so teaching kids to vary their drawings isn't a tip, it's the lesson.
  • Perceived performance beats raw performance. Pre-loading during the landing page made the loading screen effectively disappear.
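
The "variety is the curriculum" point can be reproduced with a toy classifier. The sketch below swaps in a nearest-centroid rule over two made-up features, roundness and redness, instead of the MobileNet pipeline; all numbers are invented for illustration.

```typescript
// Each drawing is reduced to two hypothetical features: [roundness, redness].
type Drawing = { label: string; features: [number, number] };

// Average the feature vectors of each label.
function centroids(examples: Drawing[]): Map<string, [number, number]> {
  const sums = new Map<string, { x: number; y: number; n: number }>();
  for (const { label, features } of examples) {
    const s = sums.get(label) ?? { x: 0, y: 0, n: 0 };
    s.x += features[0];
    s.y += features[1];
    s.n += 1;
    sums.set(label, s);
  }
  const out = new Map<string, [number, number]>();
  sums.forEach((s, label) => out.set(label, [s.x / s.n, s.y / s.n]));
  return out;
}

// Pick the label whose centroid is closest to the query.
function classify(examples: Drawing[], query: [number, number]): string {
  let best = "";
  let bestDist = Infinity;
  centroids(examples).forEach((c, label) => {
    const d = (c[0] - query[0]) ** 2 + (c[1] - query[1]) ** 2;
    if (d < bestDist) {
      bestDist = d;
      best = label;
    }
  });
  return best;
}

// Biased set: every circle is black, every square is red.
const biased: Drawing[] = [
  { label: "circle", features: [0.9, 0.0] },
  { label: "circle", features: [0.8, 0.1] },
  { label: "square", features: [0.3, 1.0] },
  { label: "square", features: [0.4, 0.9] },
];

// Varied set: both shapes appear in both colors.
const varied: Drawing[] = [
  ...biased,
  { label: "circle", features: [0.9, 1.0] },
  { label: "circle", features: [0.8, 0.9] },
  { label: "square", features: [0.3, 0.0] },
  { label: "square", features: [0.4, 0.1] },
];

const redCircle: [number, number] = [0.85, 1.0];
console.log(classify(biased, redCircle)); // "square": color dominates
console.log(classify(varied, redCircle)); // "circle": shape dominates
```

In the biased set, redness separates the classes more cleanly than roundness, so the red circle lands nearer the square centroid. Adding red circles and black squares cancels the color signal and leaves shape as the only feature that still separates the examples.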

What's next for ScratchML

One idea: earn the Scratch comparison.

  • Action blocks: "when it sees ⭕ → play a sound." The trained model becomes a sensor kids can program against; the blocks become a language, not a recipe.
  • Fool-my-model links — friends try to draw something that breaks your model. Adversarial thinking, disguised as a game.
  • A second sense — sound classification through the same blocks, proving the vocabulary generalizes.
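
Since action blocks are still roadmap, the following is only a guess at their shape: each rule pairs a label and a confidence threshold with something to run. `Prediction`, `ActionRule`, and `fireRules` are hypothetical names, and the callback stands in for playing a sound.

```typescript
// Hypothetical shapes; nothing here is taken from the ScratchML codebase.
type Prediction = { label: string; confidence: number };
type ActionRule = { whenLabel: string; minConfidence: number; run: () => void };

// Runs every rule that matches the prediction; returns how many fired.
function fireRules(prediction: Prediction, rules: ActionRule[]): number {
  let fired = 0;
  for (const rule of rules) {
    const matches =
      rule.whenLabel === prediction.label &&
      prediction.confidence >= rule.minConfidence;
    if (matches) {
      rule.run();
      fired++;
    }
  }
  return fired;
}

// "when it sees a circle, play a sound" with the sound replaced by a log entry.
const played: string[] = [];
const rules: ActionRule[] = [
  { whenLabel: "circle", minConfidence: 0.8, run: () => played.push("ding") },
];

fireRules({ label: "circle", confidence: 0.93 }, rules); // fires
fireRules({ label: "circle", confidence: 0.55 }, rules); // too unsure, skipped
fireRules({ label: "square", confidence: 0.99 }, rules); // wrong label, skipped
console.log(played);
```

The confidence threshold matters for the "sensor" framing: without it, every noisy frame would trigger the action.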
