A Gemini AI Studio-powered utility for converting plain text into Speech Markdown Language (SML) with AI-inferred emotion and prosody.
This application uses Google's Gemini AI to analyze text automatically and generate Speech Markdown with appropriate prosodic elements, including:
- Emotional tone inference
- Prosody markers (pitch, rate, volume)
- Natural speech patterns
- SSML-compatible output
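To make the prosody markers and the SSML relationship concrete, here is a minimal sketch of how a small subset of Speech Markdown could map to SSML. The `toSsml` helper is hypothetical and not part of this repository; it only handles the `(text)[pitch|rate|volume:"value"]` and `[break:"time"]` forms, and the exact markup the app generates may differ.

```typescript
// Illustrative only: toSsml is a hypothetical helper covering a small
// subset of Speech Markdown. It is not the converter used by this app.

// Escape characters that are special in XML so plain text stays valid SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function toSsml(speechMarkdown: string): string {
  // Double quotes are left untouched by escapeXml, so the patterns below
  // still match after escaping.
  let out = escapeXml(speechMarkdown);

  // (text)[pitch:"high"] -> <prosody pitch="high">text</prosody>
  out = out.replace(
    /\(([^()]+)\)\[(pitch|rate|volume):"([^"]+)"\]/g,
    (_match: string, text: string, attr: string, value: string) =>
      `<prosody ${attr}="${value}">${text}</prosody>`
  );

  // [break:"500ms"] -> <break time="500ms"/>
  out = out.replace(/\[break:"(\d+(?:ms|s))"\]/g, '<break time="$1"/>');

  return `<speak>${out}</speak>`;
}
```

For example, `toSsml('(Wow!)[pitch:"high"] [break:"500ms"] (Let me think.)[rate:"slow"]')` wraps the two phrases in `<prosody>` elements and turns the pause into a `<break>` element.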
- Drag & Drop Upload: Simple file upload interface for TXT or MD files
- AI-Powered Analysis: Gemini AI automatically infers emotional context and appropriate prosody
- Batch Processing: Handle entire books or long-form content
- Real-time Generation: See progress as your text is processed
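Handling book-length input usually means splitting the text into pieces small enough to send to the model one at a time, which also gives a natural unit for reporting progress. The sketch below shows one plausible approach, splitting on paragraph boundaries. `chunkText` and its 4000-character default are assumptions for illustration; this README does not describe how the app actually batches text.

```typescript
// Illustrative only: chunkText is a hypothetical helper, and the
// 4000-character default is an arbitrary choice, not a Gemini limit.
function chunkText(text: string, maxChars = 4000): string[] {
  // Treat one or more blank lines as a paragraph boundary.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length <= maxChars || current === "") {
      // Either the paragraph fits, or it is the first paragraph of a chunk.
      // An oversized paragraph is kept whole rather than cut mid-sentence.
      current = candidate;
    } else {
      chunks.push(current);
      current = paragraph;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk could then be processed in order, with progress shown as the number of chunks completed out of `chunks.length`.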
Prerequisites: Node.js
- Clone this repository:
    git clone https://github.com/danielrosehill/Text-To-SSML-Generator.git
    cd Text-To-SSML-Generator
- Install dependencies:
    npm install
- Set up your Gemini API key:
  - Copy .env.local.example to .env.local (if needed)
  - Add your Gemini API key:
      GEMINI_API_KEY=your_api_key_here
- Run the development server:
    npm run dev
- Open your browser and navigate to http://localhost:3000 (or the port shown in the terminal)
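A common setup mistake is leaving `GEMINI_API_KEY` unset or still equal to the placeholder value. The sketch below shows one way a script could fail early with a clear message. `requireApiKey` is a hypothetical helper; how this app actually reads the key is not described here.

```typescript
// Illustrative only: requireApiKey is a hypothetical helper, not part of
// this repository. It takes the environment as a plain object so it can be
// used with process.env in Node.js or with any other key/value source.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key || key === "your_api_key_here") {
    throw new Error(
      "GEMINI_API_KEY is missing or still set to the placeholder; add it to .env.local"
    );
  }
  return key;
}
```

In a Node.js script this could be called as `requireApiKey(process.env)` once the values from `.env.local` have been loaded into the environment.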
- Click the upload area or drag and drop your TXT or MD file
- Click "Generate SML" to start the conversion
- Wait for the AI to process your text
- Download or copy the generated Speech Markdown
- Visit Google AI Studio
- Sign in with your Google account
- Navigate to "Get API Key"
- Create a new API key or use an existing one
- Audiobook Production: Add natural prosody to book narrations
- Voice Assistant Development: Create more expressive TTS output
- Content Accessibility: Enhance screen reader experiences
- Podcast Scripts: Add emotional markers to scripted content
- Built with Gemini AI Studio
- Node.js runtime
- Speech Markdown Language (SML) output format
Contributions welcome! Please feel free to submit issues or pull requests.
Prompt and idea: me (Daniel Rosehill)
Code: Gemini 2.5 Pro via AI Studio app builder


