Inspiration
The justice system often fails to accurately represent African American Vernacular English (AAVE) speakers due to transcription errors, leading to misinterpretations and unfair outcomes. Inspired by the need for linguistic equity and cultural understanding, we created "Voices Unheard" to ensure every voice is heard authentically and accurately.
What It Does
"Voices Unheard" is an AI-powered platform that:
- Transcribes AAVE speech into both Standard English and native AAVE formats.
- Provides insightful linguistic analysis to highlight key AAVE grammatical features.
- Bridges the gap between diverse dialects and technology, empowering fair legal proceedings, media representation, and education.
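The dual-output idea can be sketched with a tiny rule table. Everything below — the `RULES` table, the feature names, and the `analyze` function — is a hypothetical stand-in for illustration, not our production models, which cover far more features with context-sensitive logic:

```python
import re

# Hypothetical rule table: each entry pairs a regex for one AAVE
# grammatical feature with its name and a Standard English rendering.
RULES = [
    # Habitual "be" marks an ongoing or habitual action.
    (re.compile(r"\b(\w+) be (\w+ing)\b"), "habitual 'be'",
     lambda m: f"{m.group(1)} is usually {m.group(2)}"),
    # Completive "done" marks a completed action.
    (re.compile(r"\b(\w+) done (\w+ed)\b"), "completive 'done'",
     lambda m: f"{m.group(1)} has {m.group(2)}"),
]

def analyze(utterance):
    """Return the original AAVE utterance, a Standard English
    rendering, and the grammatical features detected."""
    rendered, features = utterance, []
    for pattern, name, transform in RULES:
        if pattern.search(rendered):
            features.append(name)
            rendered = pattern.sub(lambda m, t=transform: t(m), rendered)
    return {"aave": utterance, "standard": rendered, "features": features}
```

For example, `analyze("she be working late")` preserves the original utterance, returns "she is usually working late" as the Standard English rendering, and flags the habitual-"be" feature — the key point being that the AAVE form is kept alongside the translation, never overwritten.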
How We Built It
To meet the hackathon’s challenge of creating a truly original, AI-powered system, we engineered Voices Unheard from the ground up:
- Frontend: Developed in Streamlit for rapid deployment and an intuitive, cross-platform user experience.
- Backend: A Flask API pipeline processes user-submitted audio, routes it through our NLP and speech modules, and delivers structured output in real time.
- Speech Recognition: Leveraged raw audio input via the SpeechRecognition library and real-time file streaming.
- AI Model Architecture:
  - Custom rule-based and statistical Natural Language Processing (NLP) models designed around the syntax, phonology, and pragmatics of AAVE, not just Standard English.
  - Extended with handcrafted grammar-recognition patterns and dialect-transformation logic, not reliant on wrapper APIs or prebuilt translation tools.
- Temporary File Management: Secure, real-time handling of audio uploads for seamless user interaction and model processing.
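The temporary-file step can be sketched in a few lines (the function and parameter names are illustrative, not our exact implementation): each upload is written to a uniquely named temp file, handed to the speech/NLP pipeline, and removed even if processing fails.

```python
import os
import tempfile

def handle_upload(audio_bytes, process):
    """Write uploaded audio to a unique temp file, run the given
    processing step on it, and always clean the file up afterward."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        return process(path)  # e.g. the transcription pipeline
    finally:
        os.remove(path)       # runs even if process() raises
```

Passing the pipeline in as a callable keeps the file-handling logic testable independently of the speech modules.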
Challenges we ran into
- Limited datasets specific to AAVE for training accurate translation models.
- Balancing linguistic fidelity with accessibility in Standard English outputs.
- Ensuring seamless integration between real-time audio input and backend processing.
Accomplishments that we're proud of
- Successfully developed a dual-transcription system that preserves the authenticity of AAVE while providing clear Standard English translations.
- Created a scalable platform that addresses systemic inequities in transcription technology.
- Fostered awareness of the importance of linguistic diversity in technology.
What we learned
- AAVE is structurally rich and rule-governed, requiring deliberate, culturally sensitive modeling rather than "corrections" or simplification.
- Cross-domain collaboration (linguistics, law, and AI) leads to deeper solutions that serve real people, not just demos.
- Building inclusive AI means embedding cultural understanding directly into the model pipeline—not just the UX.
What's next for Voices Unheard
- Expand Training Data: Develop or partner to curate datasets across regional and generational variations of AAVE, incorporating code-switching and phonetic variation.
- Interpret Other Underrepresented Language Varieties: Extend our linguistic engine to cover Chicano English, Appalachian English, Cajun Vernacular, and Native American English, helping marginalized speakers be understood in courts and institutions.
- Real-Time & Live Use Cases: Build out live courtroom, broadcast, and classroom integrations using browser-based transcription and real-time linguistic overlays.
- Strategic Partnerships: Collaborate with legal tech companies, public defenders, DEI consultants, and edtech platforms to amplify Voices Unheard's impact and integrate it into high-need ecosystems.
With Voices Unheard, we’re not just building software—we’re building a future where every dialect, culture, and voice is represented with dignity and precision in the systems that shape our lives.