Inspiration
The justice system often fails to accurately represent African American Vernacular English (AAVE) speakers due to transcription errors, leading to misinterpretations and unfair outcomes. Inspired by the need for linguistic equity and cultural understanding, we created "Voices Unheard" to ensure every voice is heard authentically and accurately.
What It Does
"Voices Unheard" is an AI-powered platform that:
- Transcribes AAVE speech into both Standard English and native AAVE formats.
- Provides insightful linguistic analysis to highlight key AAVE grammatical features.
- Bridges the gap between diverse dialects and technology, empowering fair legal proceedings, media representation, and education.
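The dual-output idea can be sketched with a tiny rule table. Everything below — the `RULES` table, the feature names, and the `analyze` function — is a hypothetical stand-in for illustration, not our production models, which cover far more features with context-sensitive logic:

```python
import re

# Hypothetical rule table: each entry pairs a regex for one AAVE
# grammatical feature with its name and a Standard English rendering.
RULES = [
    # Habitual "be" marks an ongoing or habitual action.
    (re.compile(r"\b(\w+) be (\w+ing)\b"), "habitual 'be'",
     lambda m: f"{m.group(1)} is usually {m.group(2)}"),
    # Completive "done" marks a completed action.
    (re.compile(r"\b(\w+) done (\w+ed)\b"), "completive 'done'",
     lambda m: f"{m.group(1)} has {m.group(2)}"),
]

def analyze(utterance):
    """Return the original AAVE utterance, a Standard English
    rendering, and the grammatical features detected."""
    rendered, features = utterance, []
    for pattern, name, transform in RULES:
        if pattern.search(rendered):
            features.append(name)
            rendered = pattern.sub(lambda m, t=transform: t(m), rendered)
    return {"aave": utterance, "standard": rendered, "features": features}
```

For example, `analyze("she be working late")` preserves the original utterance, returns "she is usually working late" as the Standard English rendering, and flags the habitual-"be" feature — the key point being that the AAVE form is kept alongside the translation, never overwritten.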
How We Built It
To meet the hackathon’s challenge of creating a truly original, AI-powered system, we engineered Voices Unheard from the ground up:
- Frontend: Developed in Streamlit for rapid deployment and an intuitive, cross-platform user experience.
- Backend: A Flask API pipeline processes user-submitted audio, routes it through our NLP and speech modules, and delivers structured output in real time.
- Speech Recognition: Leveraged raw audio input via the SpeechRecognition library and real-time file streaming.
- AI Model Architecture:
  - Custom rule-based and statistical Natural Language Processing (NLP) models designed around the syntax, phonology, and pragmatics of AAVE, not just Standard English.
  - Extended with handcrafted grammar-recognition patterns and dialect-transformation logic, not reliant on wrapper APIs or prebuilt translation tools.
- Temporary File Management: Secure, real-time handling of audio uploads for seamless user interaction and model processing.
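The temporary-file step can be sketched in a few lines (the function and parameter names are illustrative, not our exact implementation): each upload is written to a uniquely named temp file, handed to the speech/NLP pipeline, and removed even if processing fails.

```python
import os
import tempfile

def handle_upload(audio_bytes, process):
    """Write uploaded audio to a unique temp file, run the given
    processing step on it, and always clean the file up afterward."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        return process(path)  # e.g. the transcription pipeline
    finally:
        os.remove(path)       # runs even if process() raises
```

Passing the pipeline in as a callable keeps the file-handling logic testable independently of the speech modules.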
Challenges we ran into
- Limited datasets specific to AAVE for training accurate translation models.
- Balancing linguistic fidelity with accessibility in Standard English outputs.
- Ensuring seamless integration between real-time audio input and backend processing.
Accomplishments that we're proud of
- Successfully developed a dual-transcription system that preserves the authenticity of AAVE while providing clear Standard English translations.
- Created a scalable platform that addresses systemic inequities in transcription technology.
- Fostered awareness of the importance of linguistic diversity in technology.
What we learned
- AAVE is structurally rich and rule-governed, requiring deliberate, culturally sensitive modeling rather than "corrections" or simplification.
- Cross-domain collaboration (linguistics, law, and AI) leads to deeper solutions that serve real people, not just demos.
- Building inclusive AI means embedding cultural understanding directly into the model pipeline—not just the UX.
What's next for Voices Unheard
- Expand Training Data: Develop or partner to curate datasets across regional and generational variations of AAVE, incorporating code-switching and phonetic variation.
- Interpret Other Underrepresented Language Varieties: Extend our linguistic engine to cover Chicano English, Appalachian English, Cajun Vernacular, and Native American English, helping marginalized speakers be understood in courts and institutions.
- Real-Time & Live Use Cases: Build out live courtroom, broadcast, and classroom integrations using browser-based transcription and real-time linguistic overlays.
- Strategic Partnerships: Collaborate with legal tech companies, public defenders, DEI consultants, and edtech platforms to amplify Voices Unheard's impact and integrate it into high-need ecosystems.
With Voices Unheard, we’re not just building software—we’re building a future where every dialect, culture, and voice is represented with dignity and precision in the systems that shape our lives.