Telegram Summary Bot

A Cloudflare Worker bot that ingests Telegram group messages and produces on-demand and daily AI summaries.

Features

Store non-command group messages in D1.
/summary [Nh [Mh]] for custom window summaries.
/summaryday alias for last 24h.
/status for service snapshot and split real/synthetic summary metrics.
Daily cron summary dispatch (08:00 UTC / 17:00 JST).
Summary command rate limiting (per-user-in-chat and per-chat fixed windows).
GitHub Actions CI (typecheck) and manual CD workflow.

Architecture

Ingress: Telegram sends updates to Worker webhook (/telegram), which validates secret headers, parses commands, enqueues summary jobs, and ingests group messages.
Storage: D1 stores raw messages, summaries, service stats, and rate-limit counters, plus queue-idempotency claim records.
Summarization: shared application pipeline loads windowed messages and calls Workers AI for clustered summaries in queue consumers.
Scheduling: Cloudflare Cron enqueues daily summary jobs at 08:00 UTC.
Queue: Cloudflare Queues (summary-jobs) processes one job per batch (max_batch_size=1, max_batch_timeout=0) to minimize delivery latency.
Ops: tracked wrappers persist success/error state for /status and incident debugging.
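
As a rough illustration of the ingress check, the sketch below compares the configured webhook secret against the `X-Telegram-Bot-Api-Secret-Token` header, which Telegram attaches to webhook requests when a `secret_token` was set through setWebhook. The helper names (`constantTimeEqual`, `isAuthorizedWebhook`) are hypothetical and are not taken from the project source; the constant-time comparison is an assumption about good practice, not a statement of what the Worker does.

```typescript
// Hypothetical helpers for illustration only.
const SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token";

// Compare two strings without returning early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  const enc = new TextEncoder();
  const x = enc.encode(a);
  const y = enc.encode(b);
  let diff = x.length ^ y.length;
  const n = Math.max(x.length, y.length);
  for (let i = 0; i < n; i++) {
    diff |= (x[i] ?? 0) ^ (y[i] ?? 0);
  }
  return diff === 0;
}

// Returns true only when the request carries the expected, non-empty secret.
function isAuthorizedWebhook(
  headers: { get(name: string): string | null },
  expectedSecret: string,
): boolean {
  const provided = headers.get(SECRET_HEADER);
  if (provided === null || expectedSecret.length === 0) return false;
  return constantTimeEqual(provided, expectedSecret);
}
```

The `headers` parameter is typed structurally, so a standard `Headers` object from a `Request` can be passed directly.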

Queue-Based Summary Flow

/summary and /summaryday in allowlisted groups are enqueue-only:
- Webhook applies rate limiting.
- Webhook enqueues an on-demand summary job.
- Webhook returns 200 immediately (no interim "processing" reply).
- The queue consumer posts the actual reply when processing completes.
Daily cron is enqueue-only:
- Loads chats that were active in the last 24h.
- Skips non-allowlisted chats.
- Enqueues one daily summary job per target chat.
Queue consumer behavior:
- Daily jobs: summarize and post to chat.
- On-demand jobs: summarize and reply to the source command message.
- Delivery is at-least-once; processing is protected with D1 claim idempotency.
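
The flow above can be sketched with two job shapes and a planning step for the daily cron. All type and field names here (`SummaryJob`, `planDailyJobs`, `deliveryTarget`, `replyToMessageId`, and so on) are assumptions made for illustration; the actual queue payloads may look different.

```typescript
// Hypothetical job payloads; field names are assumptions.
type SummaryJob =
  | { kind: "daily"; chatId: number; windowStart: number; windowEnd: number }
  | {
      kind: "on_demand";
      chatId: number;
      replyToMessageId: number;
      windowStart: number;
      windowEnd: number;
    };

const DAY_SECONDS = 24 * 60 * 60;

// Daily cron: skip non-allowlisted chats and emit one job per target chat.
function planDailyJobs(
  activeChatIds: number[],
  allowlist: ReadonlySet<number>,
  nowSeconds: number,
): SummaryJob[] {
  const seen = new Set<number>();
  const jobs: SummaryJob[] = [];
  for (const chatId of activeChatIds) {
    if (!allowlist.has(chatId) || seen.has(chatId)) continue;
    seen.add(chatId);
    jobs.push({
      kind: "daily",
      chatId,
      windowStart: nowSeconds - DAY_SECONDS,
      windowEnd: nowSeconds,
    });
  }
  return jobs;
}

// Consumer side: daily jobs post to the chat, on-demand jobs reply to the command.
function deliveryTarget(job: SummaryJob): { chatId: number; replyToMessageId?: number } {
  return job.kind === "on_demand"
    ? { chatId: job.chatId, replyToMessageId: job.replyToMessageId }
    : { chatId: job.chatId };
}
```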

Queue Retry and Failure Semantics

Queue delivery is at-least-once. summary_queue_jobs prevents duplicate summary sends/replies across retries and overlapping consumers.
Ownership-safe completion/release:
- Claims include ownership fields (lease_until, updated_at).
- Done/release operations are conditional on the same ownership fields.
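
To make the ownership idea concrete, here is an in-memory sketch in which a consumer can only mark a claim done if the `lease_until` and `updated_at` values it received at claim time still match the stored row. In D1 this would presumably be a conditional `UPDATE ... WHERE` on the same fields; the `ClaimStore` class, the `status` column, and the `jobId` key are assumptions for illustration.

```typescript
// In-memory stand-in for a summary_queue_jobs row. Only lease_until and
// updated_at come from the description above; the rest is assumed.
interface ClaimRow {
  jobId: string;
  status: "in_flight" | "done";
  lease_until: number;
  updated_at: number;
}

interface Ownership {
  lease_until: number;
  updated_at: number;
}

class ClaimStore {
  private rows = new Map<string, ClaimRow>();

  // Claim a job if it is unclaimed or its previous lease has expired.
  tryClaim(jobId: string, now: number, leaseSeconds: number): Ownership | null {
    const row = this.rows.get(jobId);
    if (row && (row.status === "done" || row.lease_until > now)) return null;
    const next: ClaimRow = {
      jobId,
      status: "in_flight",
      lease_until: now + leaseSeconds,
      updated_at: now,
    };
    this.rows.set(jobId, next);
    return { lease_until: next.lease_until, updated_at: next.updated_at };
  }

  // Mark done only if the caller's ownership fields still match the row.
  markDone(jobId: string, owner: Ownership, now: number): boolean {
    const row = this.rows.get(jobId);
    if (
      !row ||
      row.status !== "in_flight" ||
      row.lease_until !== owner.lease_until ||
      row.updated_at !== owner.updated_at
    ) {
      return false;
    }
    row.status = "done";
    row.updated_at = now;
    return true;
  }
}
```

A consumer whose lease expired and was taken over by another consumer gets `false` from `markDone`, so it cannot overwrite the newer owner's state.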
Retries:
- in_flight claim: message retries near lease expiry.
- ai_error on daily jobs: retry with longer delay.
- transient processing/send/DB failures: retry with default delay.
Non-retry outcomes:
- Daily no_messages, no_text, degraded: ack silently (no chat message).
- On-demand no_messages, no_text, degraded: send user-facing reply, then ack.
Queue v1 currently has no DLQ configured.
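
The rules in this section can be read as a small decision table. The sketch below encodes them as a pure function; the outcome labels `ok` and `transient_error`, the two delay constants, and the handling of `ai_error` for on-demand jobs (treated here like a transient failure) are assumptions, since the text above only specifies the daily case.

```typescript
type JobKind = "daily" | "on_demand";
type Outcome =
  | "ok"
  | "in_flight"
  | "ai_error"
  | "transient_error"
  | "no_messages"
  | "no_text"
  | "degraded";

interface QueueDecision {
  action: "ack" | "retry";
  delaySeconds?: number;
  // True when a user-facing explanation should be sent before acking.
  sendFallbackReply: boolean;
}

// Assumed values; the real delays are not stated in this README.
const DEFAULT_RETRY_DELAY_SECONDS = 30;
const AI_ERROR_RETRY_DELAY_SECONDS = 300;

function decide(kind: JobKind, outcome: Outcome, secondsUntilLeaseExpiry = 0): QueueDecision {
  switch (outcome) {
    case "ok":
      return { action: "ack", sendFallbackReply: false };
    case "in_flight":
      // Another consumer holds the claim: retry close to its lease expiry.
      return {
        action: "retry",
        delaySeconds: Math.max(1, Math.ceil(secondsUntilLeaseExpiry)),
        sendFallbackReply: false,
      };
    case "ai_error":
      // Assumption: only daily jobs get the longer delay.
      return {
        action: "retry",
        delaySeconds: kind === "daily" ? AI_ERROR_RETRY_DELAY_SECONDS : DEFAULT_RETRY_DELAY_SECONDS,
        sendFallbackReply: false,
      };
    case "transient_error":
      return { action: "retry", delaySeconds: DEFAULT_RETRY_DELAY_SECONDS, sendFallbackReply: false };
    case "no_messages":
    case "no_text":
    case "degraded":
      // Daily jobs ack silently; on-demand jobs reply to the user first.
      return { action: "ack", sendFallbackReply: kind === "on_demand" };
  }
}
```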

Summary Persistence

Successful summaries are persisted in D1 summaries.
/status reports Stored summaries as the count of persisted summary rows.
Persistence is history/audit/troubleshooting oriented. It is not currently a general performance cache mechanism.

Forwarded Message Attribution

For regular messages, sender attribution comes from message.from.
For forwarded messages, sender attribution is overridden only when forward_origin.type is user (using forward_origin.sender_user).
For all other forward-origin types (hidden_user, chat, channel) and automatic forwards, attribution remains the forwarding user (message.from).
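
A minimal sketch of this attribution rule follows. The interfaces are a trimmed subset of the Telegram Bot API `Message`, `User`, and `MessageOrigin` objects; the function name `resolveSender` is hypothetical.

```typescript
// Trimmed subset of Telegram Bot API objects, enough for attribution.
interface TgUser {
  id: number;
  first_name: string;
  username?: string;
}

type TgMessageOrigin =
  | { type: "user"; sender_user: TgUser }
  | { type: "hidden_user"; sender_user_name: string }
  | { type: "chat"; sender_chat: { id: number } }
  | { type: "channel"; chat: { id: number } };

interface TgMessage {
  from?: TgUser;
  forward_origin?: TgMessageOrigin;
  is_automatic_forward?: boolean;
}

// Hypothetical helper: pick the user a message should be attributed to.
function resolveSender(message: TgMessage): TgUser | undefined {
  const origin = message.forward_origin;
  if (origin && origin.type === "user" && !message.is_automatic_forward) {
    // Forwarded from a visible user: attribute to the original author.
    return origin.sender_user;
  }
  // Regular messages, other origin types, and automatic forwards.
  return message.from;
}
```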

Tradeoffs

Strong simplicity and low cost over advanced configurability.
Self-host/operator model over shared multi-tenant service.
Fixed-window rate limiting over more complex token-bucket or adaptive models.
Prompt-driven formatting over deterministic template-only summarization.

Self-Hosting Model

This project is designed for self-hosting, not as a shared public SaaS bot.
Each operator should deploy and manage their own Worker instance.
The operator is expected to understand basic Telegram Bot + Cloudflare Worker operations and to actively participate in chats where the bot is installed.
A deployment can restrict usage to specific chats with TELEGRAM_ALLOWED_CHAT_IDS.

Stack

Cloudflare Workers
Cloudflare D1
Cloudflare Workers AI
Telegram Bot API
TypeScript + pnpm

Prerequisites

Node.js 22+
pnpm
Cloudflare account and wrangler access
Telegram bot token from BotFather

Local Setup

Install dependencies:
```
pnpm install
```

Configure local env vars in .dev.vars (example):

```
TELEGRAM_BOT_TOKEN=...
TELEGRAM_WEBHOOK_SECRET=...
TELEGRAM_ALLOWED_CHAT_IDS=-1001234567890
PROJECT_REPO_URL=https://github.com/escape0707/telegram-summary-bot
```

Run local worker:
```
pnpm dev
```

Deploy

Deploy worker:
```
pnpm run deploy
```

Configure webhook and bot commands:

```
TELEGRAM_BOT_TOKEN='<bot_token>' \
TELEGRAM_WEBHOOK_URL='https://<your-worker-domain>/telegram' \
TELEGRAM_WEBHOOK_SECRET='<webhook_secret>' \
pnpm run telegram:setup
```

Optional:

```
TELEGRAM_ALLOWED_UPDATES='message,edited_message'
TELEGRAM_DROP_PENDING_UPDATES='true'
```
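
For orientation, the sketch below shows one plausible way these environment variables could map onto the parameters of Telegram's setWebhook method (`url`, `secret_token`, `allowed_updates`, `drop_pending_updates`). The mapping and the function name `buildSetWebhookPayload` are assumptions; the actual setup script may build its request differently.

```typescript
// Environment variables read by the setup step, as documented above.
interface SetupEnv {
  TELEGRAM_WEBHOOK_URL: string;
  TELEGRAM_WEBHOOK_SECRET: string;
  TELEGRAM_ALLOWED_UPDATES?: string;
  TELEGRAM_DROP_PENDING_UPDATES?: string;
}

// Subset of the Bot API setWebhook parameters.
interface SetWebhookPayload {
  url: string;
  secret_token: string;
  allowed_updates?: string[];
  drop_pending_updates?: boolean;
}

// Hypothetical mapping from env vars to a setWebhook request body.
function buildSetWebhookPayload(env: SetupEnv): SetWebhookPayload {
  const payload: SetWebhookPayload = {
    url: env.TELEGRAM_WEBHOOK_URL,
    secret_token: env.TELEGRAM_WEBHOOK_SECRET,
  };
  // Comma-separated list, ignoring blanks and surrounding whitespace.
  const updates = (env.TELEGRAM_ALLOWED_UPDATES ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  if (updates.length > 0) payload.allowed_updates = updates;
  if (env.TELEGRAM_DROP_PENDING_UPDATES === "true") payload.drop_pending_updates = true;
  return payload;
}
```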

Allowlist and Onboarding

TELEGRAM_ALLOWED_CHAT_IDS controls which chat IDs can use this deployment.
Format: comma-separated numeric IDs, for example: -1001234567890,-1009876543210.
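
Parsing that format could look like the sketch below. The function name `parseAllowedChatIds` is hypothetical, and rejecting malformed entries with an error (rather than skipping them) is an assumption, not documented behavior.

```typescript
// Hypothetical parser for the TELEGRAM_ALLOWED_CHAT_IDS format.
function parseAllowedChatIds(raw: string | undefined): Set<number> {
  const ids = new Set<number>();
  for (const part of (raw ?? "").split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate empty input and trailing commas
    if (!/^-?\d+$/.test(token)) {
      throw new Error(`Invalid chat ID in allowlist: ${token}`);
    }
    ids.add(Number(token));
  }
  return ids;
}

// Example: group and supergroup IDs are negative numbers.
const allowed = parseAllowedChatIds("-1001234567890,-1009876543210");
const isAllowed = allowed.has(-1001234567890);
```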

Recommended production setup:

```
wrangler secret put TELEGRAM_ALLOWED_CHAT_IDS
```

Non-allowlisted chat commands receive a self-host guidance reply with the current chat.id, so users can self-host their own instance without log inspection.
/help and /start provide onboarding guidance in DMs only.
In groups, /help and /start are intentionally ignored to reduce noise.

Scripts

pnpm dev: local worker with test-scheduled support.
pnpm run deploy: deploy worker.
pnpm test: Vitest watch mode.
pnpm run test:run: run Node-based unit tests (*.test.ts).
pnpm run test:workers: run Workers integration tests (*.worker.test.ts).
pnpm run test:coverage: run unit tests with coverage report.
pnpm run typecheck: run TypeScript compiler checks with tsc.
pnpm run format: format files with Prettier.
pnpm run format:check: check formatting in CI/local.
pnpm run lint: run ESLint.
pnpm run lint:fix: run ESLint autofixes.
pnpm run lint:md: lint Markdown docs.
pnpm cf-typegen: regenerate Cloudflare types.
pnpm run telegram:setup: call Telegram setWebhook + setMyCommands.

Testing

Unit tests run in Node with vitest.config.ts and target *.test.ts.
Workers integration tests run in local workerd/Miniflare with vitest.workers.config.ts and target *.worker.test.ts.
In restricted sandbox environments, pnpm run test:workers may require elevated permissions because workerd starts isolated runtimes.
src/db/rateLimits.worker.test.ts intentionally uses prepare(...).run()/DELETE setup instead of DB.exec() due to workers-sdk issue #11999.

Rate Limiting

Applies to /summary and /summaryday only.
Uses fixed 10-minute windows.
Default limits:
- Per-user-in-chat: 3 requests per 10 minutes.
- Per-chat: 20 requests per 10 minutes.
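
The following in-memory sketch shows how a fixed-window limiter with these defaults behaves. It assumes windows are aligned to multiples of 600 seconds since the Unix epoch and that rejected requests do not consume quota; the real counters live in D1, and the class and key names here are hypothetical.

```typescript
const WINDOW_SECONDS = 600;
const PER_USER_LIMIT = 3;
const PER_CHAT_LIMIT = 20;

type LimitResult = "ok" | "user_limited" | "chat_limited";

// Start of the fixed window containing the given timestamp (assumed epoch-aligned).
function windowStart(nowSeconds: number): number {
  return Math.floor(nowSeconds / WINDOW_SECONDS) * WINDOW_SECONDS;
}

class FixedWindowLimiter {
  private counts = new Map<string, number>();

  check(chatId: number, userId: number, nowSeconds: number): LimitResult {
    const start = windowStart(nowSeconds);
    const userKey = `u:${chatId}:${userId}:${start}`;
    const chatKey = `c:${chatId}:${start}`;
    const userCount = this.counts.get(userKey) ?? 0;
    const chatCount = this.counts.get(chatKey) ?? 0;
    if (userCount >= PER_USER_LIMIT) return "user_limited";
    if (chatCount >= PER_CHAT_LIMIT) return "chat_limited";
    // Only accepted requests are counted in this sketch.
    this.counts.set(userKey, userCount + 1);
    this.counts.set(chatKey, chatCount + 1);
    return "ok";
  }
}
```

Because the window key includes the window start, counters reset implicitly when the next window begins; old keys are simply never read again.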
Stale rate-limit rows are cleaned by daily cron in bounded batches.
Cleanup defaults:
- Retention: 3 days (RATE_LIMIT_CLEANUP_RETENTION_SECONDS).
- Batch size: 500 rows (RATE_LIMIT_CLEANUP_BATCH_SIZE).
- Max batches per cron run: 20 (RATE_LIMIT_CLEANUP_MAX_BATCHES).
Tuning values live in src/config.ts.
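
The bounded-batch cleanup can be pictured as the loop below. The constants mirror the documented defaults, while `cleanupStaleRows` and the injected `deleteBatch` callback are hypothetical stand-ins for the D1 delete; the sketch is synchronous for brevity, whereas a real D1 call would be asynchronous.

```typescript
const RATE_LIMIT_CLEANUP_RETENTION_SECONDS = 3 * 24 * 60 * 60;
const RATE_LIMIT_CLEANUP_BATCH_SIZE = 500;
const RATE_LIMIT_CLEANUP_MAX_BATCHES = 20;

// deleteBatch removes up to `limit` rows older than `cutoffSeconds`
// and returns how many rows it actually deleted.
function cleanupStaleRows(
  deleteBatch: (cutoffSeconds: number, limit: number) => number,
  nowSeconds: number,
): { deleted: number; batches: number } {
  const cutoff = nowSeconds - RATE_LIMIT_CLEANUP_RETENTION_SECONDS;
  let deleted = 0;
  let batches = 0;
  while (batches < RATE_LIMIT_CLEANUP_MAX_BATCHES) {
    const n = deleteBatch(cutoff, RATE_LIMIT_CLEANUP_BATCH_SIZE);
    batches++;
    deleted += n;
    // A short batch means nothing stale is left for this run.
    if (n < RATE_LIMIT_CLEANUP_BATCH_SIZE) break;
  }
  return { deleted, batches };
}
```

With these defaults a single cron run deletes at most 10,000 rows, which keeps the work per run bounded; any remainder is picked up by later runs.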

Docs

Operational runbook: docs/ops.md
Project plan and progress: PLAN.md

Synthetic Benchmarks and Demo Safety

Keep benchmark runs separate from production usage metrics by recording them as synthetic_benchmark telemetry.
Use only synthetic/anonymized message datasets for benchmark and demo material.
For operator workflow and privacy checklist, see docs/ops.md ("Synthetic Benchmark Workflow" and "Privacy-Safe Demo Capture").

License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See LICENSE.
