danielrosehill/Text-To-SSML-Generator

SML Prosody Generator

A Gemini AI Studio-powered utility for converting plain text into Speech Markdown Language (SML) with AI-inferred emotion and prosody.

[Screenshot: SML Prosody Generator interface]

Overview

This application uses Google's Gemini AI to automatically analyze text and generate Speech Markdown with appropriate prosodic elements including:

  • Emotional tone inference
  • Prosody markers (pitch, rate, volume)
  • Natural speech patterns
  • SSML-compatible output
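
To make the prosody markers above concrete, here is a minimal sketch of how pitch, rate, and volume values could be rendered as a standard SSML `<prosody>` element. The `Prosody` interface and the `escapeXml` and `toProsodySsml` helpers are hypothetical names chosen for illustration; they are not taken from this repository, and the app's actual output format may differ.

```typescript
// Hypothetical helpers for illustration; not this repository's implementation.
interface Prosody {
  pitch?: string;  // e.g. "+10%", "low", "high"
  rate?: string;   // e.g. "slow", "fast", "90%"
  volume?: string; // e.g. "soft", "loud", "+6dB"
}

// Escape characters that are special in XML so the text is safe inside SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wrap text in an SSML <prosody> element, emitting only the attributes provided.
function toProsodySsml(text: string, prosody: Prosody): string {
  const attrs = (["pitch", "rate", "volume"] as const)
    .filter((key) => prosody[key] !== undefined)
    .map((key) => `${key}="${escapeXml(prosody[key] as string)}"`)
    .join(" ");
  const body = escapeXml(text);
  return attrs ? `<prosody ${attrs}>${body}</prosody>` : body;
}

const excited = toProsodySsml("I can't believe we won!", {
  pitch: "+10%",
  rate: "fast",
  volume: "loud",
});
console.log(`<speak>${excited}</speak>`);
// <speak><prosody pitch="+10%" rate="fast" volume="loud">I can&apos;t believe we won!</prosody></speak>
```

In this sketch the emotional inference itself is left to the model; the helper only shows the shape of markup that such an inference could be mapped onto.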

Features

  • Drag & Drop Upload: Simple file upload interface for TXT or MD files
  • AI-Powered Analysis: Gemini AI automatically infers emotional context and appropriate prosody
  • Batch Processing: Handle entire books or long-form content
  • Real-time Generation: See progress as your text is processed
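
Handling entire books usually means splitting the input before sending it to a model. The sketch below shows one possible approach, assuming paragraph-based chunking with a character limit. The `chunkText` function and its `maxChars` parameter are hypothetical; the repository does not document how it actually batches long files.

```typescript
// Hypothetical sketch; the app's real batching strategy is not documented.
// Split text into chunks of at most maxChars, preferring paragraph boundaries.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars < 1) {
    throw new Error("maxChars must be at least 1");
  }
  // Paragraphs are separated by one or more blank lines.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length <= maxChars) {
      current = candidate;
      continue;
    }
    if (current) chunks.push(current);
    // A paragraph longer than the limit is hard-split so no chunk exceeds maxChars.
    let rest = paragraph;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current = rest;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Processing chunk by chunk would also give a natural unit for progress reporting, for example "chunk 3 of 40", though whether the app reports progress this way is an assumption.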

Screenshots

Upload Interface

Upload your text file

File Selected

File ready for processing

Processing

AI generating SML markup

Installation & Setup

Prerequisites: Node.js

  1. Clone this repository:

    git clone https://github.com/danielrosehill/Text-To-SSML-Generator.git
    cd Text-To-SSML-Generator
  2. Install dependencies:

    npm install
  3. Set up your Gemini API key:

    • If .env.local does not exist yet, copy .env.local.example to .env.local
    • Add your Gemini API key:
      GEMINI_API_KEY=your_api_key_here
      
  4. Run the development server:

    npm run dev
  5. Open your browser and navigate to http://localhost:3000 (or the port shown in terminal)
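
A missing or placeholder key from step 3 is a common cause of startup failures. As a rough sketch, a check like the following could fail early with a clear message. The `requireApiKey` helper is hypothetical and is not part of this repository; only the `GEMINI_API_KEY` variable name and the `your_api_key_here` placeholder come from the steps above.

```typescript
// Hypothetical helper; the app's real configuration code may differ.
// Returns the Gemini API key from an environment-like object, or throws
// if it is missing, blank, or still set to the placeholder value.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key || key === "your_api_key_here") {
    throw new Error(
      "GEMINI_API_KEY is missing. Add it to .env.local and restart the dev server."
    );
  }
  return key;
}
```

In a Node.js context this could be called as `requireApiKey(process.env)`. How the key actually reaches the app depends on its build configuration, which is not described here.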

Usage

  1. Click the upload area or drag and drop your TXT or MD file
  2. Click "Generate SML" to start the conversion
  3. Wait for the AI to process your text
  4. Download or copy the generated Speech Markdown
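
Step 1 accepts only TXT or MD files. A minimal sketch of such a check, assuming validation by file extension, might look like this. The `isSupportedFile` function and `SUPPORTED_EXTENSIONS` constant are hypothetical names; the app may validate uploads differently, for example by MIME type.

```typescript
// Hypothetical sketch; not taken from this repository's source.
const SUPPORTED_EXTENSIONS = [".txt", ".md"];

// Returns true when the file name ends in a supported extension (case-insensitive)
// and has at least one character before the extension.
function isSupportedFile(fileName: string): boolean {
  const lower = fileName.toLowerCase();
  return SUPPORTED_EXTENSIONS.some(
    (ext) => lower.endsWith(ext) && lower.length > ext.length
  );
}
```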

Getting a Gemini API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Navigate to "Get API Key"
  4. Create a new API key or use an existing one

Use Cases

  • Audiobook Production: Add natural prosody to book narrations
  • Voice Assistant Development: Create more expressive TTS output
  • Content Accessibility: Enhance screen reader experiences
  • Podcast Scripts: Add emotional markers to scripted content

Technology Stack

  • Built with Gemini AI Studio
  • Node.js runtime
  • Speech Markdown Language (SML) output format

Contributing

Contributions welcome! Please feel free to submit issues or pull requests.

Author

Prompt and idea: me (Daniel Rosehill)

Code: Gemini 2.5 Pro via AI Studio app builder

About

Generates SSML from text by inference
