Telegram Summary Bot

A Cloudflare Worker bot that ingests Telegram group messages and produces on-demand and daily AI summaries.

Features

  • Store non-command group messages in D1.
  • /summary [Nh [Mh]] for custom window summaries.
  • /summaryday alias for last 24h.
  • /status for service snapshot and split real/synthetic summary metrics.
  • Daily cron summary dispatch (08:00 UTC / 17:00 JST).
  • Summary command rate limiting (per-user-in-chat and per-chat fixed windows).
  • GitHub Actions CI (typecheck) and manual CD workflow.
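
As an illustration of the `/summary [Nh [Mh]]` syntax, here is a minimal parsing sketch. It is not taken from the project source. It assumes `N` is the window start and `M` the window end, both in hours before now, and the names `parseSummaryCommand`, `SummaryWindow`, and the `defaultHours` fallback are hypothetical.

```typescript
// Hypothetical shape for a parsed summary window (not from the project source).
interface SummaryWindow {
  fromHoursAgo: number; // window start, in hours before now
  toHoursAgo: number; // window end, in hours before now (0 means "now")
}

// Parses "/summary", "/summary 6h", "/summary 6h 2h", and "/summaryday".
// Returns null for other commands or malformed arguments.
// ASSUMPTION: N is the window start and M the window end; the default
// window for a bare "/summary" is a placeholder value.
function parseSummaryCommand(
  text: string,
  defaultHours = 1,
): SummaryWindow | null {
  const tokens = text.trim().split(/\s+/);
  // In groups Telegram may append "@BotName" to the command; strip it.
  const command = (tokens[0] ?? "").replace(/@\w+$/, "").toLowerCase();
  const args = tokens.slice(1);

  if (command === "/summaryday") return { fromHoursAgo: 24, toHoursAgo: 0 };
  if (command !== "/summary" || args.length > 2) return null;

  const hours: number[] = [];
  for (const arg of args) {
    const match = /^(\d+)h$/i.exec(arg);
    if (!match) return null;
    hours.push(Number(match[1]));
  }

  const fromHoursAgo = hours[0] ?? defaultHours;
  const toHoursAgo = hours[1] ?? 0;
  // The window must have positive length.
  if (fromHoursAgo <= toHoursAgo) return null;
  return { fromHoursAgo, toHoursAgo };
}
```

Under these assumptions, `/summary 6h 2h` would cover the period from six hours ago to two hours ago.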

Architecture

  • Ingress: Telegram sends updates to Worker webhook (/telegram), which validates secret headers, parses commands, enqueues summary jobs, and ingests group messages.
  • Storage: D1 stores raw messages, summaries, service stats, and rate-limit counters, plus queue-idempotency claim records.
  • Summarization: queue consumers run a shared application pipeline that loads windowed messages and calls Workers AI for clustered summaries.
  • Scheduling: Cloudflare Cron enqueues daily summary jobs at 08:00 UTC.
  • Queue: Cloudflare Queues (summary-jobs) processes one job per batch (max_batch_size=1, max_batch_timeout=0) to minimize delivery latency.
  • Ops: tracked wrappers persist success/error state for /status and incident debugging.
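
The secret-header check at ingress could look roughly like the sketch below. Telegram sends the secret configured via setWebhook in the `X-Telegram-Bot-Api-Secret-Token` header. The function names, the `HeaderReader` interface, and the constant-time style comparison are assumptions for illustration, not the project's actual code.

```typescript
// Header Telegram uses to echo the webhook secret_token.
const SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token";

// Minimal structural type; the standard Headers class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Best-effort constant-time comparison: always scans the longer input
// so the loop length does not depend on where the first mismatch is.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end; "| 0" turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// Returns true only when the header is present and matches the secret.
// ASSUMPTION: an empty configured secret rejects every request.
function isAuthorizedWebhook(
  headers: HeaderReader,
  expectedSecret: string,
): boolean {
  const provided = headers.get(SECRET_HEADER);
  if (provided === null || expectedSecret.length === 0) return false;
  return safeEqual(provided, expectedSecret);
}
```

In a Worker fetch handler this would be called as `isAuthorizedWebhook(request.headers, env.TELEGRAM_WEBHOOK_SECRET)` before any parsing of the update body.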

Queue-Based Summary Flow

  • /summary and /summaryday in allowlisted groups are enqueue-only:
    • Webhook applies rate limiting.
    • Webhook enqueues an on-demand summary job.
    • Webhook returns 200 immediately (no interim "processing" reply).
    • The queue consumer posts the actual reply when processing completes.
  • Daily cron is enqueue-only:
    • Loads chats that were active in the last 24h.
    • Skips non-allowlisted chats.
    • Enqueues one daily summary job per target chat.
  • Queue consumer behavior:
    • Daily jobs: summarize and post to chat.
    • On-demand jobs: summarize and reply to the source command message.
    • Delivery is at-least-once; processing is protected with D1 claim idempotency.
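
The job shapes implied by this flow can be sketched as follows. The type and function names are hypothetical and the real queue message format may differ. The `reply_parameters` field is the Telegram Bot API's sendMessage option for replying to a specific message.

```typescript
// Hypothetical queue message shapes (not from the project source).
interface DailyJob {
  kind: "daily";
  chatId: number;
}
interface OnDemandJob {
  kind: "on_demand";
  chatId: number;
  sourceMessageId: number; // the /summary command message to reply to
}
type SummaryJob = DailyJob | OnDemandJob;

// Cron side: one daily job per active chat, skipping non-allowlisted chats.
function buildDailyJobs(
  activeChatIds: number[],
  allowlist: Set<number>,
): DailyJob[] {
  const unique = Array.from(new Set(activeChatIds));
  return unique
    .filter((chatId) => allowlist.has(chatId))
    .map((chatId): DailyJob => ({ kind: "daily", chatId }));
}

// Subset of the Telegram sendMessage parameters used here.
interface SendMessagePayload {
  chat_id: number;
  text: string;
  reply_parameters?: { message_id: number };
}

// Consumer side: daily summaries are posted to the chat, on-demand
// summaries are sent as a reply to the source command message.
function buildSendMessagePayload(
  job: SummaryJob,
  text: string,
): SendMessagePayload {
  if (job.kind === "on_demand") {
    return {
      chat_id: job.chatId,
      text,
      reply_parameters: { message_id: job.sourceMessageId },
    };
  }
  return { chat_id: job.chatId, text };
}
```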

Queue Retry and Failure Semantics

  • Queue delivery is at-least-once. summary_queue_jobs prevents duplicate summary sends/replies across retries and overlapping consumers.
  • Ownership-safe completion/release:
    • Claims include ownership fields (lease_until, updated_at).
    • Done/release operations are conditional on the same ownership fields.
  • Retries:
    • in_flight claim: the message is retried near lease expiry.
    • ai_error on daily jobs: retry with longer delay.
    • transient processing/send/DB failures: retry with default delay.
  • Non-retry outcomes:
    • Daily no_messages, no_text, degraded: ack silently (no chat message).
    • On-demand no_messages, no_text, degraded: send user-facing reply, then ack.
  • Queue v1 currently has no DLQ configured.
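
To illustrate the claim and ownership rules above, here is a minimal in-memory sketch. The real implementation uses the D1 table summary_queue_jobs; the `ClaimStore` class, its method names, and the choice to delete a row on release are assumptions made for this example.

```typescript
// Hypothetical in-memory stand-in for the summary_queue_jobs table.
interface ClaimRow {
  status: "in_flight" | "done";
  leaseUntil: number; // epoch milliseconds
  updatedAt: number; // epoch milliseconds
}

// Ownership fields handed back to the consumer that won the claim.
interface Ownership {
  leaseUntil: number;
  updatedAt: number;
}

class ClaimStore {
  private rows = new Map<string, ClaimRow>();

  // Wins the claim if the job is new or its previous lease has expired.
  // Returns null when the job is done or another consumer holds a live lease.
  tryClaim(jobKey: string, now: number, leaseMs: number): Ownership | null {
    const row = this.rows.get(jobKey);
    if (row && (row.status === "done" || row.leaseUntil > now)) return null;
    const owner = { leaseUntil: now + leaseMs, updatedAt: now };
    this.rows.set(jobKey, { status: "in_flight", ...owner });
    return owner;
  }

  // Marks the job done only if the caller still owns the claim.
  markDone(jobKey: string, owner: Ownership, now: number): boolean {
    const row = this.rows.get(jobKey);
    if (!row || !this.owns(row, owner)) return false;
    this.rows.set(jobKey, { ...row, status: "done", updatedAt: now });
    return true;
  }

  // Releases the claim so a retry can pick the job up immediately.
  // ASSUMPTION: release removes the row; the project may update it instead.
  release(jobKey: string, owner: Ownership): boolean {
    const row = this.rows.get(jobKey);
    if (!row || !this.owns(row, owner)) return false;
    this.rows.delete(jobKey);
    return true;
  }

  // Conditional check on the same ownership fields the claim returned.
  private owns(row: ClaimRow, owner: Ownership): boolean {
    return (
      row.status === "in_flight" &&
      row.leaseUntil === owner.leaseUntil &&
      row.updatedAt === owner.updatedAt
    );
  }
}
```

The conditional check matters when a slow consumer outlives its lease: a second consumer can take over the job, and the first consumer's late completion or release then becomes a no-op instead of clobbering the new owner's claim.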

Summary Persistence

  • Successful summaries are persisted in D1 summaries.
  • /status reports Stored summaries as the count of persisted summary rows.
  • Persistence is history/audit/troubleshooting oriented. It is not currently a general performance cache mechanism.

Forwarded Message Attribution

  • For regular messages, sender attribution comes from message.from.
  • For forwarded messages, sender attribution is overridden only when forward_origin.type is user (using forward_origin.sender_user).
  • For all other forward-origin types (hidden_user, chat, channel) and automatic forwards, attribution remains the forwarding user (message.from).
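
These attribution rules can be sketched as a small function. The field names follow the Telegram Bot API (`from`, `forward_origin`, `is_automatic_forward`), with the types trimmed to what the example needs; `resolveSender` is a hypothetical name, not the project's.

```typescript
// Trimmed-down Telegram Bot API types, only the fields used here.
interface TelegramUser {
  id: number;
  first_name: string;
}

type MessageOrigin =
  | { type: "user"; sender_user: TelegramUser }
  | { type: "hidden_user"; sender_user_name: string }
  | { type: "chat"; sender_chat: { id: number } }
  | { type: "channel"; chat: { id: number } };

interface IncomingMessage {
  from?: TelegramUser;
  forward_origin?: MessageOrigin;
  is_automatic_forward?: boolean;
}

// Returns the user a message should be attributed to.
function resolveSender(message: IncomingMessage): TelegramUser | undefined {
  const origin = message.forward_origin;
  // Only a forward from a visible user overrides the attribution.
  if (
    origin !== undefined &&
    origin.type === "user" &&
    message.is_automatic_forward !== true
  ) {
    return origin.sender_user;
  }
  // Regular messages, other origin types, and automatic forwards
  // stay attributed to message.from (the forwarding user).
  return message.from;
}
```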

Tradeoffs

  • Simplicity and low cost over advanced configurability.
  • Self-host/operator model over shared multi-tenant service.
  • Fixed-window rate limiting over more complex token-bucket or adaptive models.
  • Prompt-driven formatting over deterministic template-only summarization.

Self-Hosting Model

  • This project is designed for self-hosting, not as a shared public SaaS bot.
  • Each operator should deploy and manage their own Worker instance.
  • The operator is expected to understand basic Telegram Bot + Cloudflare Worker operations and to actively participate in chats where the bot is installed.
  • Each deployment can restrict usage to specific chats with TELEGRAM_ALLOWED_CHAT_IDS.

Stack

  • Cloudflare Workers
  • Cloudflare D1
  • Cloudflare Workers AI
  • Telegram Bot API
  • TypeScript + pnpm

Prerequisites

  • Node.js 22+
  • pnpm
  • Cloudflare account and wrangler access
  • Telegram bot token from BotFather

Local Setup

  1. Install dependencies:

    pnpm install
  2. Configure local env vars in .dev.vars (example):

    TELEGRAM_BOT_TOKEN=...
    TELEGRAM_WEBHOOK_SECRET=...
    TELEGRAM_ALLOWED_CHAT_IDS=-1001234567890
    PROJECT_REPO_URL=https://github.com/escape0707/telegram-summary-bot
  3. Run local worker:

    pnpm dev

Deploy

  1. Deploy worker:

    pnpm run deploy
  2. Configure webhook and bot commands:

    TELEGRAM_BOT_TOKEN='<bot_token>' \
    TELEGRAM_WEBHOOK_URL='https://<your-worker-domain>/telegram' \
    TELEGRAM_WEBHOOK_SECRET='<webhook_secret>' \
    pnpm run telegram:setup

Optional environment variables for pnpm run telegram:setup:

  • TELEGRAM_ALLOWED_UPDATES='message,edited_message'
  • TELEGRAM_DROP_PENDING_UPDATES='true'

Allowlist and Onboarding

  • TELEGRAM_ALLOWED_CHAT_IDS controls which chat IDs can use this deployment.

  • Format: comma-separated numeric IDs, for example: -1001234567890,-1009876543210.

  • Recommended production setup:

    wrangler secret put TELEGRAM_ALLOWED_CHAT_IDS
  • Non-allowlisted chat commands receive a self-host guidance reply with the current chat.id, so users can self-host their own instance without log inspection.

  • /help and /start provide onboarding guidance in DMs only.

  • In groups, /help and /start are intentionally ignored to reduce noise.
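
A minimal sketch of how the comma-separated allowlist format could be parsed and checked. The function names are hypothetical. Throwing on malformed entries and treating an empty allowlist as "no chats allowed" are assumptions, not confirmed project behavior.

```typescript
// Parses a value like "-1001234567890,-1009876543210" into a set of IDs.
// ASSUMPTION: malformed entries are a configuration error and throw.
function parseAllowedChatIds(raw: string | undefined): Set<number> {
  const ids = new Set<number>();
  for (const part of (raw ?? "").split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") continue; // tolerate stray commas and whitespace
    if (!/^-?\d+$/.test(trimmed)) {
      throw new Error(`Invalid chat ID in allowlist: ${trimmed}`);
    }
    // Telegram chat IDs fit within JavaScript's safe integer range.
    ids.add(Number(trimmed));
  }
  return ids;
}

// ASSUMPTION: an empty allowlist allows no chats.
function isChatAllowed(allowlist: Set<number>, chatId: number): boolean {
  return allowlist.has(chatId);
}
```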

Scripts

  • pnpm dev: local worker with test-scheduled support.
  • pnpm run deploy: deploy worker.
  • pnpm test: Vitest watch mode.
  • pnpm run test:run: run Node-based unit tests (*.test.ts).
  • pnpm run test:workers: run Workers integration tests (*.worker.test.ts).
  • pnpm run test:coverage: run unit tests with coverage report.
  • pnpm run typecheck: run TypeScript compiler checks with tsc.
  • pnpm run format: format files with Prettier.
  • pnpm run format:check: check formatting in CI/local.
  • pnpm run lint: run ESLint.
  • pnpm run lint:fix: run ESLint autofixes.
  • pnpm run lint:md: lint Markdown docs.
  • pnpm cf-typegen: regenerate Cloudflare types.
  • pnpm run telegram:setup: call Telegram setWebhook + setMyCommands.

Testing

  • Unit tests run in Node with vitest.config.ts and target *.test.ts.
  • Workers integration tests run in local workerd/Miniflare with vitest.workers.config.ts and target *.worker.test.ts.
  • In restricted sandbox environments, pnpm run test:workers may require elevated permissions because workerd starts isolated runtimes.
  • src/db/rateLimits.worker.test.ts intentionally uses prepare(...).run()/DELETE setup instead of DB.exec() due to workers-sdk issue #11999.

Rate Limiting

  • Applies to /summary and /summaryday only.
  • Uses fixed 10-minute windows.
  • Default limits:
    • Per-user-in-chat: 3 requests per 10 minutes.
    • Per-chat: 20 requests per 10 minutes.
  • Stale rate-limit rows are cleaned by daily cron in bounded batches.
  • Cleanup defaults:
    • Retention: 3 days (RATE_LIMIT_CLEANUP_RETENTION_SECONDS).
    • Batch size: 500 rows (RATE_LIMIT_CLEANUP_BATCH_SIZE).
    • Max batches per cron run: 20 (RATE_LIMIT_CLEANUP_MAX_BATCHES).
  • Tuning values live in src/config.ts.
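
The fixed-window scheme with the default limits can be sketched in memory as follows. The project persists its counters in D1; the `FixedWindowLimiter` class and the choice not to count rejected requests are assumptions for illustration.

```typescript
const WINDOW_SECONDS = 600; // fixed 10-minute windows
const PER_USER_LIMIT = 3; // per-user-in-chat requests per window
const PER_CHAT_LIMIT = 20; // per-chat requests per window

// Hypothetical in-memory limiter; the project stores counters in D1.
class FixedWindowLimiter {
  private counts = new Map<string, number>();

  // Returns true and records the request if both limits still have room.
  // ASSUMPTION: rejected requests do not consume quota.
  check(chatId: number, userId: number, nowSeconds: number): boolean {
    // All timestamps within the same 10-minute bucket share one window start.
    const windowStart =
      Math.floor(nowSeconds / WINDOW_SECONDS) * WINDOW_SECONDS;
    const userKey = `user:${chatId}:${userId}:${windowStart}`;
    const chatKey = `chat:${chatId}:${windowStart}`;

    const userCount = this.counts.get(userKey) ?? 0;
    const chatCount = this.counts.get(chatKey) ?? 0;
    if (userCount >= PER_USER_LIMIT || chatCount >= PER_CHAT_LIMIT) {
      return false;
    }

    this.counts.set(userKey, userCount + 1);
    this.counts.set(chatKey, chatCount + 1);
    return true;
  }
}
```

Because windows are fixed rather than sliding, a user can in principle send a burst at the end of one window and another at the start of the next. This is part of the simplicity tradeoff noted under Tradeoffs.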

Docs

  • Operational runbook: docs/ops.md
  • Project plan and progress: PLAN.md

Synthetic Benchmarks and Demo Safety

  • Keep benchmark runs separate from production usage metrics by recording them as synthetic_benchmark telemetry.
  • Use only synthetic/anonymized message datasets for benchmark and demo material.
  • For operator workflow and privacy checklist, see docs/ops.md ("Synthetic Benchmark Workflow" and "Privacy-Safe Demo Capture").

License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See LICENSE.

Copyright

Copyright (C) 2026 Escape0707. See COPYRIGHT.
