Bxzfrm/PRISM


PRISM: Prosody-Integrated Multi-Agent Reasoning Framework for Empathetic Spoken Dialogue

Overview

Empathetic spoken dialogue systems require not only semantically appropriate responses but also emotionally aligned prosodic expression. Existing cascade pipelines often discard rich acoustic cues during speech-to-text conversion, while end-to-end speech models lack interpretable control over emotion and knowledge integration.

PRISM addresses these limitations through a multi-agent framework that decouples speech perception, response generation, and speech synthesis into coordinated components. The framework introduces a prosody-to-language translation mechanism to stabilize large language model reasoning and supports on-demand invocation of external knowledge tools for empathetic dialogue generation.

Framework

Installation

Clone the repository

git clone https://github.com/yourname/PRISM.git
cd PRISM

Create environment

conda create -n prism python=3.10
conda activate prism

Install dependencies

pip install -r requirements.txt

Dataset

Experiments are conducted on public empathetic dialogue datasets.

Please download the datasets from their official sources before training and evaluation.

Speech Synthesis Model

For speech synthesis, we employ StyleTTS2 as the backbone TTS model.

StyleTTS2 can be obtained from its official repository.
