TekstoBot 🤖

TekstoBot is a WhatsApp service that automatically transcribes voice messages to text (speech-to-text, STT), using either a local Whisper server or Cloudflare Workers AI.

🚀 Features

  • Audio Transcription: Receive voice messages and get the corresponding text via local Whisper or Cloudflare Workers AI (whisper-large-v3-turbo).
  • Administrative Dashboard: Web interface to manage authorized numbers, view history, and monitor connection status.
  • Whitelist: Only authorized numbers in the database can interact with the bot.
  • Asynchronous Processing: Uses goroutines and worker pools to ensure the bot remains responsive.
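
The asynchronous processing described above can be pictured with a small worker pool. This is only a sketch of the general pattern, not TekstoBot's actual code: `VoiceJob`, `Result`, `runPool`, and the fake transcriber are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// VoiceJob is a hypothetical unit of work: one received voice message.
type VoiceJob struct {
	ID    int
	Audio []byte
}

// Result pairs a job ID with its transcript.
type Result struct {
	JobID int
	Text  string
}

// runPool starts `workers` goroutines that drain the job channel and
// transcribe concurrently. It returns once every job has been processed.
// The transcribe argument stands in for a call to the real backend.
func runPool(workers int, jobs []VoiceJob, transcribe func([]byte) string) []Result {
	if workers < 1 {
		workers = 1 // at least one worker is needed, otherwise sends would block forever
	}
	jobCh := make(chan VoiceJob)
	resCh := make(chan Result, len(jobs)) // buffered so workers never block on results
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobCh {
				resCh <- Result{JobID: j.ID, Text: transcribe(j.Audio)}
			}
		}()
	}
	for _, j := range jobs {
		jobCh <- j
	}
	close(jobCh)
	wg.Wait()
	close(resCh)

	results := make([]Result, 0, len(jobs))
	for r := range resCh {
		results = append(results, r)
	}
	return results
}

func main() {
	// A fake transcriber so the sketch runs without any backend.
	fake := func(audio []byte) string { return strings.ToUpper(string(audio)) }
	jobs := []VoiceJob{{ID: 1, Audio: []byte("hello")}, {ID: 2, Audio: []byte("world")}}
	for _, r := range runPool(2, jobs, fake) {
		fmt.Printf("job %d: %s\n", r.JobID, r.Text)
	}
}
```

Results arrive in completion order, not submission order, which is why each result carries its job ID.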

🛠️ Tech Stack

📋 Prerequisites

  1. Go 1.21+ installed.
  2. PostgreSQL running locally.
  3. Podman (or Docker) for local Whisper server mode.
  4. psql (PostgreSQL client) to run migrations via Makefile.
  5. For Cloudflare mode: Cloudflare account with Workers AI enabled, account ID, and API token.

🔧 Prepare GPU Acceleration (Additional Step for Podman on AlmaLinux/RHEL)

To let the local Whisper container (make whisper) access your NVIDIA GPU through Podman using CDI mapping:

  1. Add the official NVIDIA repository:

    curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
      sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
  2. Install the Toolkit:

    sudo dnf install -y nvidia-container-toolkit
  3. Generate Hardware Descriptors (CDI) so Podman recognizes the environment:

    sudo mkdir -p /etc/cdi
    sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

⚙️ Configuration

  1. Clone the repository:

    git clone <repo-url>
    cd tekstobot
  2. Configure the environment variables:

    cp .env.example .env
    # Edit the .env file with your database credentials
  3. Install dependencies:

    go mod tidy

Transcription backend configuration

TekstoBot supports two transcription backends:

  • TRANSCRIBER_BACKEND=local (default): uses WHISPER_URL.
  • TRANSCRIBER_BACKEND=cloudflare: uses Cloudflare Workers AI endpoint @cf/openai/whisper-large-v3-turbo.

Example local configuration:

TRANSCRIBER_BACKEND=local
WHISPER_URL=http://localhost:8000
WHISPER_HEALTH_INTERVAL=30

Example Cloudflare configuration:

TRANSCRIBER_BACKEND=cloudflare
CLOUDFLARE_ACCOUNT_ID=<your-account-id>
CLOUDFLARE_API_TOKEN=<your-api-token>
# Optional ISO 639-1 language code. Leave empty for auto-detection.
CLOUDFLARE_WHISPER_LANGUAGE=en

When the Cloudflare backend is selected, CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN are required and are validated at startup.
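
A minimal sketch of what such startup validation could look like, assuming the variable names above. The `Config` struct and `loadConfig` function are hypothetical, and rejecting unknown backend values is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
)

// Config holds the transcription settings named in this README.
// The struct itself is hypothetical.
type Config struct {
	Backend    string
	WhisperURL string
	AccountID  string
	APIToken   string
	Language   string
}

// loadConfig reads settings through getenv (os.Getenv in real use) and
// fails early when the Cloudflare backend is selected without credentials.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Backend:    getenv("TRANSCRIBER_BACKEND"),
		WhisperURL: getenv("WHISPER_URL"),
		AccountID:  getenv("CLOUDFLARE_ACCOUNT_ID"),
		APIToken:   getenv("CLOUDFLARE_API_TOKEN"),
		Language:   getenv("CLOUDFLARE_WHISPER_LANGUAGE"),
	}
	if cfg.Backend == "" {
		cfg.Backend = "local" // documented default
	}
	switch cfg.Backend {
	case "local":
		return cfg, nil
	case "cloudflare":
		if cfg.AccountID == "" || cfg.APIToken == "" {
			return cfg, fmt.Errorf("cloudflare backend requires CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN")
		}
		return cfg, nil
	default:
		// Assumption: any other value is treated as a configuration error.
		return cfg, fmt.Errorf("unknown TRANSCRIBER_BACKEND %q", cfg.Backend)
	}
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("transcriber backend:", cfg.Backend)
}
```

Passing `getenv` as a parameter keeps the validation testable without touching the process environment.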

Authentication (OIDC)

TekstoBot supports OpenID Connect (OIDC) to protect the administrative dashboard. This is disabled by default (OIDC_ENABLED=false). When enabled, all dashboard routes and media files require a valid session.

Configuration variables

  • OIDC_ENABLED: Set to true to enable authentication.
  • OIDC_ISSUER_URL: The base URL of your OIDC provider.
  • OIDC_CLIENT_ID: The Client ID generated by your provider.
  • OIDC_CLIENT_SECRET: The Client Secret generated by your provider.
  • OIDC_REDIRECT_URL: The callback URL (e.g., http://localhost:8080/auth/callback).
  • OIDC_SESSION_TTL: Session duration in hours (default: 24).
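
As an illustration of how OIDC_SESSION_TTL might be interpreted, the sketch below converts the value to a duration and applies the documented default of 24 hours. The `sessionTTL` function is hypothetical, and accepting only positive whole hours is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// sessionTTL converts the raw OIDC_SESSION_TTL value (hours) into a
// duration. An empty value falls back to the documented default of 24.
// Assumption: only positive whole numbers of hours are accepted.
func sessionTTL(raw string) (time.Duration, error) {
	if raw == "" {
		return 24 * time.Hour, nil
	}
	hours, err := strconv.Atoi(raw)
	if err != nil || hours <= 0 {
		return 0, fmt.Errorf("OIDC_SESSION_TTL must be a positive number of hours, got %q", raw)
	}
	return time.Duration(hours) * time.Hour, nil
}

func main() {
	ttl, err := sessionTTL("24")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// A session created now would be valid until this time.
	fmt.Println("session valid until:", time.Now().Add(ttl).Format(time.RFC3339))
}
```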

Pocket ID Example

To use Pocket ID as your provider:

  1. In Pocket ID Admin, create a new client.
  2. Set the Redirect URL to your TekstoBot callback address.
  3. Add the following to your .env:

    OIDC_ENABLED=true
    OIDC_ISSUER_URL=https://id.yourdomain.com
    OIDC_CLIENT_ID=0123456789abcdef
    OIDC_CLIENT_SECRET=your_pocket_id_secret
    OIDC_REDIRECT_URL=http://localhost:8080/auth/callback
    OIDC_SESSION_TTL=24

🏃 How to Run

The project uses a Makefile to simplify common commands:

  1. Run Migrations:

    make migrate-up
  2. Start Whisper (Audio AI, local backend only):

    make whisper
  3. Run the Bot:

    make run
  4. Access the Dashboard: Open your browser at http://localhost:8080 (or the port defined in your .env).

📖 Makefile Commands

Run make to see the list of available commands:

  • make build: Build the binary.
  • make release: Build the optimized binary for production.
  • make package: Generate the RPM package using nfpm.
  • make run: Run the project locally.
  • make whisper: Start the Whisper container via Podman.
  • make whisper-stop: Stop and remove the Whisper container.
  • make migrate-up: Apply the database migrations.
  • make migrate-down: Roll back the database migrations.
  • make check: Run build and import checks.

Cloudflare backend notes

  • Endpoint used by TekstoBot: POST /client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/openai/whisper-large-v3-turbo
  • Authentication: Authorization: Bearer <CLOUDFLARE_API_TOKEN>
  • Input format: base64-encoded audio bytes.
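
To make the three points above concrete, here is a sketch that builds (but does not send) such a request. It is not TekstoBot's own client: `newTranscribeRequest` is a hypothetical helper, the host `api.cloudflare.com` and the JSON field names `audio` and `language` reflect my reading of the Cloudflare Workers AI documentation and should be checked against it.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
)

// Assumption: the REST API is served from this host.
const cloudflareAPIBase = "https://api.cloudflare.com"

// newTranscribeRequest builds the POST request for the Workers AI
// whisper-large-v3-turbo model. The audio bytes are base64-encoded into
// a JSON body; language is optional (empty means auto-detection).
func newTranscribeRequest(accountID, token, language string, audio []byte) (*http.Request, error) {
	payload := map[string]string{
		"audio": base64.StdEncoding.EncodeToString(audio), // assumed field name
	}
	if language != "" {
		payload["language"] = language // assumed field name
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/client/v4/accounts/%s/ai/run/@cf/openai/whisper-large-v3-turbo",
		cloudflareAPIBase, accountID)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newTranscribeRequest("my-account-id", "my-api-token", "en", []byte("fake audio bytes"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request would be a matter of passing it to an `http.Client` and decoding the JSON response, whose shape is not covered here.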

Limits and quotas

Cloudflare Workers AI limits and pricing are managed by Cloudflare and can change over time. At the time of writing:

  • Automatic Speech Recognition default rate limit: 720 requests/minute.
  • whisper-large-v3-turbo pricing is charged by audio minute.

Always verify the latest values in the Cloudflare Workers AI documentation.

Troubleshooting (Cloudflare mode)

  • Startup fails with missing Cloudflare credentials: set CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN.
  • 401/403 errors: verify token validity, scopes, and account ID.
  • 429 errors / throttling: request rate exceeded; apply retry/backoff in your deployment workflow and monitor usage.
  • Transcription errors in UI: detailed backend errors are stored and shown in media history to aid diagnosis.
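
For the 429 case, one common approach is exponential backoff between retries. The sketch below shows the idea under stated assumptions: `errRateLimited`, `backoffDelay`, and `retryOnRateLimit` are hypothetical names, and the README does not say whether TekstoBot retries on its own.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRateLimited is a hypothetical sentinel standing in for an HTTP 429 response.
var errRateLimited = errors.New("rate limited (429)")

// backoffDelay returns base doubled `attempt` times, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// retryOnRateLimit calls op up to maxAttempts times. It sleeps between
// attempts only when op reports errRateLimited; success or any other
// error is returned immediately.
func retryOnRateLimit(maxAttempts int, base, max time.Duration, op func() error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err = op()
		if !errors.Is(err, errRateLimited) {
			return err
		}
		if attempt < maxAttempts-1 {
			time.Sleep(backoffDelay(attempt, base, max))
		}
	}
	return err
}

func main() {
	calls := 0
	// Simulated backend that is throttled twice before succeeding.
	err := retryOnRateLimit(4, 10*time.Millisecond, time.Second, func() error {
		calls++
		if calls < 3 {
			return errRateLimited
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

In a real deployment, adding random jitter to each delay helps avoid many clients retrying at the same instant.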

📦 Production Deployment (RPM + Quadlets)

For production environments on AlmaLinux/RHEL, TekstoBot can be installed as an RPM package that automatically manages the Whisper and Bot containers via Podman Quadlets and systemd.

1. Requirements for Building the Package

Ensure you have the nfpm utility installed to generate the package:

# Example nfpm installation (Go-based tool)
go install github.com/goreleaser/nfpm/v2/cmd/nfpm@latest

2. Generate and Install the Package

# Build the optimized binary and generate the RPM
make package

# Install the generated package
sudo dnf install ./tekstobot.rpm

3. Automatic GPU Configuration

The RPM post-install script detects whether NVIDIA support is present on the host.

  • With GPU: Whisper starts using the cuda image and CDI mapping.
  • Without GPU: Whisper automatically falls back to cpu mode.

4. Service Initialization

Services are managed by systemd. On each start or restart of tekstobot, systemd reads TRANSCRIBER_BACKEND from /etc/tekstobot.env and either starts the local Whisper Quadlet (local) or stops it (cloudflare), so no separate sync step is needed.

# Configure your .env (required before starting)
sudo cp /etc/tekstobot.env.example /etc/tekstobot.env
sudo vi /etc/tekstobot.env

# Enable and start TekstoBot
sudo systemctl enable --now tekstobot

# Follow the logs
sudo journalctl -u tekstobot -f

  • TRANSCRIBER_BACKEND=local: the Podman Whisper service is started before the bot.
  • TRANSCRIBER_BACKEND=cloudflare: the local Whisper service is stopped and not used; configure Cloudflare credentials in the same file.

After changing TRANSCRIBER_BACKEND (or other settings), run sudo systemctl restart tekstobot to apply them.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
