NIK-TIGER-BILL/sberjazz-sdk

SberJazz SDK

Python SDK for automating SberJazz video conferences via Playwright.

Join a room as a bot, send messages, and play audio/video — all without a real user. Includes a ready-to-use Web UI for quick interactive testing.

Requirements

  • Python 3.11+
  • ffmpeg available in PATH

Installation

pip install -r requirements.txt
playwright install chromium

Quick start

import asyncio
from sberjazz import SberJazzClient

async def main():
    url = "https://jazz.sberbank.ru/your-room-id?psw=..."
    async with SberJazzClient(url, name="MyBot") as client:
        await client.send_message("Hello from the bot!")
        await client.play_audio("speech.mp3")
        await client.play_video("presentation.mp4")

asyncio.run(main())

API

| Method | Description |
| --- | --- |
| SberJazzClient(url, name, headless) | Async context manager. __aenter__ opens a browser and joins the room; __aexit__ leaves and closes. |
| send_message(text) | Send a chat message |
| mic_on() / mic_off() | Toggle microphone |
| camera_on() / camera_off() | Toggle camera |
| play_audio(path) | Play an audio file (mp3/wav/…) through the microphone |
| play_video(path) | Play a video file (mp4/…) through the camera and microphone |
| play_audio_stream(queue, src_rate, src_sample_width) | Stream real-time PCM audio from an asyncio.Queue |

play_audio_stream details

# Producer puts PCM chunks (bytes) into the queue; None signals EOF.
queue: asyncio.Queue[bytes | None] = asyncio.Queue()

await client.play_audio_stream(
    queue,
    src_rate=24000,       # source sample rate (must be a divisor of 48000)
    src_sample_width=2,   # bytes per sample: 2 = int16
)
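A producer for that queue could look like the sketch below. It is not taken from the SDK: the helper names `sine_pcm_chunks` and `produce`, the 20 ms chunk size, and the mono little-endian int16 layout are assumptions made for illustration. It generates a test tone, checks that the source rate divides 48000, and ends the stream with None.

```python
import asyncio
import math
import struct

TARGET_RATE = 48000  # the rate src_rate must divide, per the comment above


def sine_pcm_chunks(freq_hz: float, seconds: float,
                    src_rate: int = 24000, chunk_ms: int = 20):
    """Yield int16 PCM chunks of a sine tone (mono, little-endian: assumed)."""
    if TARGET_RATE % src_rate != 0:
        raise ValueError(f"src_rate {src_rate} does not divide {TARGET_RATE}")
    total = int(seconds * src_rate)          # total number of samples
    per_chunk = src_rate * chunk_ms // 1000  # samples per chunk
    for start in range(0, total, per_chunk):
        count = min(per_chunk, total - start)
        samples = [
            int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * (start + i) / src_rate))
            for i in range(count)
        ]
        # "<h" = little-endian signed 16-bit, i.e. src_sample_width=2
        yield struct.pack(f"<{count}h", *samples)


async def produce(queue: asyncio.Queue, src_rate: int = 24000) -> None:
    """Hypothetical producer: push one second of a 440 Hz tone, then EOF."""
    for chunk in sine_pcm_chunks(440.0, 1.0, src_rate):
        await queue.put(chunk)
    await queue.put(None)  # None signals EOF to the consumer
```

One way to use it would be to run `produce(queue)` concurrently with `client.play_audio_stream(queue, src_rate=24000, src_sample_width=2)`, for example via `asyncio.gather`. Whether the SDK expects mono input or a particular chunk size is not stated here, so treat those choices as placeholders.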

Web UI

The ui/ folder contains a FastAPI backend and a single-page frontend for interactive testing.

python ui/server.py
# Open http://localhost:8765

Features:

  • Join / leave a room
  • Play audio files (drag & drop)
  • Play video files (drag & drop)
  • Clone a voice and synthesise text with Qwen TTS Realtime (requires DASHSCOPE_API_KEY)

Voice cloning example

# Set your API key
cp .env.example .env
# Edit .env and fill in DASHSCOPE_API_KEY and JAZZ_ROOM_URL

# Place a ≥5-second voice recording
cp your_voice.mp3 examples/voice_sample.mp3

python examples/voice_cloning_jazz.py

Running tests

pytest tests/ -v

Tests open a real browser and connect to a live Jazz room, so set TEST_URL inside tests/test_client.py to a valid room URL before running them.

Architecture

sberjazz/
├── __init__.py       # Public API: SberJazzClient
├── client.py         # Async context manager, join/leave flow
├── browser.py        # Playwright Chromium + WebRTC interception
├── chat.py           # Chat message sending
└── media.py          # AudioWorklet injection, PCM/video streaming

ui/
├── server.py         # FastAPI backend (REST API for the UI)
└── index.html        # Single-page frontend

examples/
└── voice_cloning_jazz.py   # End-to-end voice cloning + Jazz streaming

How WebRTC injection works:

  1. BrowserManager injects a script before any page JS runs. It patches RTCPeerConnection (to track all connections) and getUserMedia (to swap our custom tracks in).
  2. MediaController injects an AudioWorklet (ring-buffer PCM processor) and a hidden <canvas> for video.
  3. replaceTrack() swaps the worklet/canvas stream into existing WebRTC senders — no page reload needed.
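Steps 1 and 3 can be illustrated with a minimal sketch. This is not the SDK's actual code: the globals `__sketchPeerConnections` and `__sketchReplaceTrack` and the function `open_patched_page` are hypothetical names, and the getUserMedia patch and the AudioWorklet are left out. It relies only on Playwright's `add_init_script`, which runs a script before any page script, and on the standard `RTCRtpSender.replaceTrack()`.

```python
import asyncio

# JavaScript that runs before any page script. It wraps RTCPeerConnection
# so every connection the page creates is remembered, and exposes a helper
# that swaps a replacement track into matching senders.
INIT_SCRIPT = """
(() => {
  const NativePC = window.RTCPeerConnection;
  if (!NativePC) return;
  const tracked = [];
  window.__sketchPeerConnections = tracked;
  window.RTCPeerConnection = class extends NativePC {
    constructor(...args) {
      super(...args);
      tracked.push(this);
    }
  };
  window.__sketchReplaceTrack = async (track) => {
    for (const pc of tracked) {
      for (const sender of pc.getSenders()) {
        if (sender.track && sender.track.kind === track.kind) {
          await sender.replaceTrack(track);
        }
      }
    }
  };
})();
"""


async def open_patched_page(url: str) -> int:
    """Open url with the patch installed; return how many connections were tracked."""
    # Imported lazily so the module loads even without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context()
        await context.add_init_script(INIT_SCRIPT)  # applies to every page in the context
        page = await context.new_page()
        await page.goto(url)
        count = await page.evaluate("window.__sketchPeerConnections.length")
        await browser.close()
        return count
```

Subclassing the native constructor keeps static members and `instanceof` checks working, which a plain wrapper function would not. Because `replaceTrack()` changes the outgoing track without renegotiation, the swap can happen on a live call.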

License

MIT
