Voices Unheard #catapult-pu25

We use AI to accurately transcribe and translate underrepresented dialects into Standard English while preserving their authenticity, supporting fair legal outcomes, cultural understanding, and inclusive technology.

Inspiration

The justice system often fails to accurately represent African American Vernacular English (AAVE) speakers due to transcription errors, leading to misinterpretations and unfair outcomes. Inspired by the need for linguistic equity and cultural understanding, we created "Voices Unheard" to ensure every voice is heard authentically and accurately.

What It Does

"Voices Unheard" is an AI-powered platform that:

  • Transcribes AAVE speech into both Standard English and native AAVE formats.
  • Provides insightful linguistic analysis to highlight key AAVE grammatical features.
  • Bridges the gap between diverse dialects and technology, empowering fair legal proceedings, media representation, and education.
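The grammatical-feature highlighting described above could be sketched as a small rule-based matcher. The patterns, feature names, and `highlight_features` function below are illustrative assumptions, not the project's actual implementation:

```python
import re

# Illustrative regex patterns for a few well-documented AAVE grammatical
# features; a real system would need far more linguistic nuance than this.
AAVE_FEATURES = {
    "habitual 'be'": re.compile(
        r"\b(?:he|she|they|we|I|you)\s+be\s+\w+ing\b", re.IGNORECASE
    ),
    "completive 'done'": re.compile(r"\bdone\s+\w+ed\b", re.IGNORECASE),
    "negative concord": re.compile(
        r"\b(?:don't|ain't|can't|didn't)\b[^.?!]*\b(?:no|nothing|nobody|never)\b",
        re.IGNORECASE,
    ),
}

def highlight_features(text):
    """Return the names of AAVE features whose patterns match the text."""
    return [name for name, pattern in AAVE_FEATURES.items() if pattern.search(text)]

print(highlight_features("They be working late every night."))
# prints ["habitual 'be'"]
```

A real analyzer would also need phonological and pragmatic context, but a pattern table like this makes each recognized feature easy to explain to the user.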

How We Built It

To meet the hackathon’s challenge of creating a truly original, AI-powered system, we engineered Voices Unheard from the ground up:

  • Frontend: Developed in Streamlit for rapid deployment and an intuitive, cross-platform user experience.
  • Backend: A Flask API pipeline processes user-submitted audio, routes it through our NLP and speech modules, and delivers structured output in real-time.
  • Speech Recognition: Leveraged raw audio input via the SpeechRecognition library and real-time file streaming.
  • AI Model Architecture:
    • Custom rule-based and statistical Natural Language Processing (NLP) models designed around the syntax, phonology, and pragmatics of AAVE, not just Standard English.
    • Extended with handcrafted grammar recognition patterns and dialect transformation logic (not reliant on wrapper APIs or prebuilt translation tools).
  • Temporary File Management: Secure, real-time handling of audio uploads for seamless user interaction and model processing.
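A minimal sketch of the Flask upload path with the temporary-file handling described above; the `/transcribe` route, response keys, and `transcribe` stub are assumptions for illustration, not the project's actual code:

```python
import os
import tempfile

from flask import Flask, jsonify, request

app = Flask(__name__)

def transcribe(path):
    # Placeholder: in the real pipeline this would call the speech
    # recognition and NLP modules on the saved audio file.
    return {"aave_text": "...", "standard_english": "..."}

@app.route("/transcribe", methods=["POST"])  # route name is illustrative
def handle_audio():
    upload = request.files.get("audio")
    if upload is None:
        return jsonify({"error": "no audio file provided"}), 400
    # Write the upload to a temporary file so downstream tools that
    # expect a file path can process it, then always clean it up.
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(upload.read())
        result = transcribe(path)
    finally:
        os.remove(path)
    return jsonify(result)
```

The `try`/`finally` around the temp file is the key point: uploads never linger on disk even if a model stage raises.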

Challenges we ran into

  • Limited datasets specific to AAVE for training accurate translation models.
  • Balancing linguistic fidelity with accessibility in Standard English outputs.
  • Ensuring seamless integration between real-time audio input and backend processing.

Accomplishments that we're proud of

  • Successfully developed a dual-transcription system that preserves the authenticity of AAVE while providing clear Standard English translations.
  • Created a scalable platform that addresses systemic inequities in transcription technology.
  • Fostered awareness of the importance of linguistic diversity in technology.

What we learned

  • AAVE is structurally rich and rule-governed, requiring deliberate, culturally sensitive modeling rather than "corrections" or simplification.
  • Cross-domain collaboration (linguistics, law, and AI) leads to deeper solutions that serve real people, not just demos.
  • Building inclusive AI means embedding cultural understanding directly into the model pipeline—not just the UX.

What's next for Voices Unheard

  • Expand Training Data: Develop or partner to curate datasets across regional and generational variations of AAVE, incorporating code-switching and phonetic variation.

  • Interpret Other Underrepresented Language Varieties: Extend our linguistic engine to cover Chicano English, Appalachian English, Cajun Vernacular, and Native American English, helping marginalized speakers be understood in courts and institutions.

  • Real-Time & Live Use Cases: Build out live courtroom, broadcast, and classroom integrations using browser-based transcription and real-time linguistic overlays.

  • Strategic Partnerships: Collaborate with legal tech companies, public defenders, DEI consultants, and edtech platforms to amplify Voices Unheard's impact and integrate it into high-need ecosystems.

With Voices Unheard, we’re not just building software—we’re building a future where every dialect, culture, and voice is represented with dignity and precision in the systems that shape our lives.


Techs

  • Finetuned whisper-tiny-coraal
  • Finetuned Llama-3.2-1B-Instruct

Model Flow

flowchart TD
    Input([AAVE speech]) --> Agent1[[Whisper]]
    Agent1 --> Agent2([AAVE text])
    Agent2 --> AltOutput[AAVE text]
    Agent2 --> Agent3[[Llama3]]
    Agent3 --> Output[SE text]
    Agent3 --> Output2[Reasoning]
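The flow above can be sketched as a two-stage pipeline in which the model callables are injected, so the Whisper and Llama stages stay decoupled. The function name and return shapes here are illustrative assumptions:

```python
def run_pipeline(audio, asr_model, llm):
    """Two-stage flow: a Whisper-style ASR model produces AAVE text,
    which is both returned as-is and passed to a Llama-style model
    for a Standard English rendering plus reasoning."""
    aave_text = asr_model(audio)           # stage 1: speech -> AAVE text
    se_text, reasoning = llm(aave_text)    # stage 2: AAVE text -> SE text
    return {
        "aave_text": aave_text,
        "standard_english": se_text,
        "reasoning": reasoning,
    }

# Usage with stand-in models (the real ones would be the fine-tuned
# whisper-tiny-coraal and Llama-3.2-1B-Instruct checkpoints):
fake_asr = lambda audio: "they be working late"
fake_llm = lambda text: ("they are usually working late",
                         "habitual 'be' marks a recurring action")
result = run_pipeline(b"raw audio bytes", fake_asr, fake_llm)
```

Injecting the models as plain callables also makes each stage easy to swap or test in isolation.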

File structure

|_ dev
    |_ assets
    |_ pages #dev functions
    |_ utils
    |_ app.py #main file
    |_ requirements.txt
    |_ ...
|_ model
    |_ aave_model.py #AAVE text to SE text
    |_ finetune2.py
    |_ raw_whisper.py #AAVE speech to AAVE text
    |_ ...

Running the app on Streamlit

streamlit run app.py

Virtual environment

python3.11 -m venv .venv
source .venv/bin/activate (or .venv\Scripts\activate on Windows)
pip install -r requirements.txt

Dataset

  • ImanNalia/latest_coraal_train (Whisper)

https://huggingface.co/datasets/ImanNalia/latest_coraal_train

  • AAVE/SAE Paired Dataset (Llama3)

@inproceedings{groenwold-etal-2020-investigating,
    title = "Investigating {A}frican-{A}merican {V}ernacular {E}nglish in Transformer-Based Text Generation",
    author = "Groenwold, Sophie and Ou, Lily and Parekh, Aesha and Honnavalli, Samhita and Levy, Sharon and Mirza, Diba and Wang, William Yang",
    booktitle = "Proceedings of EMNLP",
    url = "https://www.aclweb.org/anthology/2020.emnlp-main.473",
    year = "2020"
}
