Have you ever played Animal Crossing and wondered how those little voices work? No recorded dialogue, no voice actors, just a character that somehow feels like it's talking.
That's what this project is trying to understand.
GibGen is an experiment in procedural audio for game dialogue. Given a string of text, it generates a short audio clip that sounds like a character "speaking" it, without any real words, without any recordings, and without any audio assets at all.
Indie games often skip voiced dialogue entirely. The reasons are predictable: hiring voice actors is expensive, recording takes time, and localizing audio for multiple languages multiplies both problems.
But games like Animal Crossing, Undertale, and Banjo-Kazooie proved that you don't need real speech to make a character feel alive. A few carefully shaped bleeps and the player's brain fills in the rest.
I wanted to understand how that works, and build a tool that makes it accessible to anyone making a game.
GibGen has three synthesis modes, each a different approach to the same problem:
Procedural: pure waveforms (sine, square, sawtooth, triangle). Fast, retro, very 8-bit.
Phonemic: loads a folder of your own .wav samples and stitches them together per character.
Formant: the interesting one. Uses biquad band-pass filters to shape a sawtooth oscillator into something that resembles vowel sounds. No audio files required. This is closest to how Animal Crossing actually works.
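To make the formant idea concrete, here is a minimal sketch of a sawtooth oscillator fed through a bank of band-pass biquads. It is not GibGen's actual code: the names `BandPass`, `sawtooth`, and `synth_vowel` are hypothetical, the filter coefficients follow the widely used RBJ "Audio EQ Cookbook" band-pass form, and the formant frequencies in the usage note are rough illustrative values for an "ah"-like vowel.

```rust
use std::f32::consts::PI;

/// One band-pass biquad (RBJ cookbook form with 0 dB peak gain).
/// Hypothetical type for illustration, not GibGen's real API.
struct BandPass {
    b0: f32, b2: f32, a1: f32, a2: f32, // b1 is always 0 for this filter
    x1: f32, x2: f32, y1: f32, y2: f32, // previous two inputs and outputs
}

impl BandPass {
    fn new(center_hz: f32, q: f32, sample_rate: f32) -> Self {
        let w0 = 2.0 * PI * center_hz / sample_rate;
        let alpha = w0.sin() / (2.0 * q);
        let a0 = 1.0 + alpha; // all coefficients are normalized by a0
        BandPass {
            b0: alpha / a0,
            b2: -alpha / a0,
            a1: -2.0 * w0.cos() / a0,
            a2: (1.0 - alpha) / a0,
            x1: 0.0, x2: 0.0, y1: 0.0, y2: 0.0,
        }
    }

    /// Direct form I difference equation for one sample.
    fn process(&mut self, x: f32) -> f32 {
        let y = self.b0 * x + self.b2 * self.x2
            - self.a1 * self.y1 - self.a2 * self.y2;
        self.x2 = self.x1;
        self.x1 = x;
        self.y2 = self.y1;
        self.y1 = y;
        y
    }
}

/// Naive (non-band-limited) sawtooth in [-1, 1); `phase` is in cycles, [0, 1).
fn sawtooth(phase: f32) -> f32 {
    2.0 * phase - 1.0
}

/// Run a sawtooth at `pitch_hz` through one band-pass per formant
/// and average the filter outputs.
fn synth_vowel(
    pitch_hz: f32,
    formants_hz: &[f32],
    q: f32,
    sample_rate: f32,
    num_samples: usize,
) -> Vec<f32> {
    let mut filters: Vec<BandPass> = formants_hz
        .iter()
        .map(|&f| BandPass::new(f, q, sample_rate))
        .collect();
    let mut phase = 0.0f32;
    let mut out = Vec::with_capacity(num_samples);
    for _ in 0..num_samples {
        let source = sawtooth(phase);
        phase = (phase + pitch_hz / sample_rate).fract();
        let sum: f32 = filters.iter_mut().map(|f| f.process(source)).sum();
        out.push(sum / formants_hz.len().max(1) as f32);
    }
    out
}
```

Calling `synth_vowel(220.0, &[730.0, 1090.0, 2440.0], 5.0, 44_100.0, 4410)` would produce about a tenth of a second of a vowel-ish tone. Changing the formant frequencies changes which vowel it resembles, while `pitch_hz` controls how high the voice sounds.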
All three respect ADSR envelopes, pitch, speed, and variation parameters. Every output is seed-reproducible, so the same text with the same seed and settings always generates the same voice.
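As a rough illustration of those two ideas, the sketch below shows a linear ADSR envelope and a seeded per-character pitch variation. Everything here is an assumption made for the example: the `Adsr`, `Lcg`, and `pitches_for` names are hypothetical, the envelope segments are linear, and the random source is a simple linear congruential generator rather than whatever GibGen actually uses.

```rust
/// ADSR settings: times in seconds, sustain as a level in [0, 1].
/// Hypothetical type for illustration.
struct Adsr {
    attack: f32,
    decay: f32,
    sustain: f32,
    release: f32,
}

impl Adsr {
    /// Envelope gain at time `t` for a note held for `hold` seconds.
    /// The release segment starts at `t == hold`.
    fn gain(&self, t: f32, hold: f32) -> f32 {
        // Gain while the note is still held (attack, decay, sustain).
        let held = |t: f32| -> f32 {
            if t < self.attack {
                t / self.attack
            } else if t < self.attack + self.decay {
                1.0 - (1.0 - self.sustain) * (t - self.attack) / self.decay
            } else {
                self.sustain
            }
        };
        if t < 0.0 {
            0.0
        } else if t < hold {
            held(t)
        } else if t < hold + self.release {
            // Fade linearly from the level reached at release time down to 0.
            held(hold) * (1.0 - (t - hold) / self.release)
        } else {
            0.0
        }
    }
}

/// Tiny deterministic generator (64-bit LCG) so results depend only on the seed.
struct Lcg(u64);

impl Lcg {
    /// Next value in [0, 1), built from the top 24 bits of the state.
    fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// One pitch per non-whitespace character, jittered by up to
/// `variation` (a fraction of `base_hz`) in either direction.
fn pitches_for(text: &str, base_hz: f32, variation: f32, seed: u64) -> Vec<f32> {
    let mut rng = Lcg(seed);
    text.chars()
        .filter(|c| !c.is_whitespace())
        .map(|_| {
            let jitter = (rng.next_f32() * 2.0 - 1.0) * variation;
            base_hz * (1.0 + jitter)
        })
        .collect()
}
```

Because the generator is the only source of randomness, calling `pitches_for` twice with the same text, parameters, and seed yields identical pitches, which is the property that makes a character's voice repeatable between runs.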
This is a work in progress and an active learning project. The core synthesis engine works. The GUI and CLI are functional. A lot of edges are still rough.
If you're looking for a production-ready tool, this isn't it yet. If you're curious about procedural audio or want to experiment with gibberish voices for a game jam, it might be exactly right.
(coming soon)
- Rust
- eframe / egui
- rodio
- hound