vasujain/modshield

ModShield: Incident Desk

A Reddit Devvit Web app that turns scattered queue chaos into actionable incidents — spam waves, harassment pile-ons, rule-violation clusters, duplicate bursts, and report storms — and lets moderators batch-review and batch-resolve them in one place.

ModShield is deterministic, explainable, and human-in-the-loop. It uses heuristic scoring (no LLMs), shows you exactly which signals fired, and never runs a destructive moderation action without a moderator clicking through a confirmation.

What it does

  • Watches onPostSubmit, onCommentSubmit, post/comment reports, and Automoderator filter events.
  • Runs each item through five scorers: Spam Wave, Harassment Pile-on, Rule Violation Cluster, Duplicate / Repost Burst, and Report Storm.
  • Picks the highest-scoring incident type, then either ignores the item (when its score falls below the configured strictness threshold) or merges it into an existing open incident with the same primary post / domain / fingerprint.
  • Surfaces a polished React dashboard with risk badges, signal pills, an evidence summary, an item review table, and a dry-run resolve flow.
  • Records every detection and every moderator action in an audit log.
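
The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/server/engine: the type names, the 0 to 100 score scale, the two toy scorers, and the merge-key format are all assumptions made for the example.

```typescript
// Hypothetical names throughout; the README only states that scorers run,
// the highest score wins, and items below the strictness threshold are ignored.
type IncidentType =
  | "spam_wave"
  | "harassment_pileon"
  | "rule_violation_cluster"
  | "duplicate_burst"
  | "report_storm";

interface Item {
  id: string;
  postId: string;
  body: string;
  domain?: string;
}

interface ScoreResult {
  type: IncidentType;
  score: number; // assumed 0..100 scale
  signals: string[]; // human-readable reasons a scorer fired
}

type Scorer = (item: Item) => ScoreResult;

// Returns the winning result, or null when the item should be ignored.
function pickIncident(item: Item, scorers: Scorer[], threshold: number): ScoreResult | null {
  let best: ScoreResult | null = null;
  for (const scorer of scorers) {
    const result = scorer(item);
    if (best === null || result.score > best.score) best = result;
  }
  return best !== null && best.score >= threshold ? best : null;
}

// Assumed merge key: incident type plus the domain when present, else the primary post.
function mergeKey(item: Item, result: ScoreResult): string {
  return `${result.type}:${item.domain ?? item.postId}`;
}

// Two toy scorers, for illustration only.
const spamScorer: Scorer = (item) =>
  item.domain === "spam.example"
    ? { type: "spam_wave", score: 80, signals: ["watchlisted domain"] }
    : { type: "spam_wave", score: 0, signals: [] };

const reportScorer: Scorer = () => ({ type: "report_storm", score: 10, signals: [] });
```

Because every scorer returns its signals together with the score, the winning result already carries the explanation the dashboard needs to show.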

Key features

  • Dashboard — active incidents, today's metrics (items analyzed, resolved, time saved), seed/reset demo controls.
  • Incident Detail — evidence panel, per-item risk + signals, action override per item, dry-run preview, confirm-before-apply modal, audit timeline.
  • Settings — strictness, watchlists (spam domains, spam keywords, toxic keywords, watched phrases), required post fields, required title prefixes, velocity / merge windows.
  • History — resolved incidents and time-saved totals.
  • Demo seed — generates realistic Spam Wave / Harassment Pile-on / Rule Violation Cluster / Duplicate Burst incidents in Redis only, with no real Reddit content touched.
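
As an illustration of what a velocity window setting might control, here is a minimal in-memory rolling-window counter. The class name, the millisecond units, and the per-key grouping are assumptions; the README says the real counters live in Redis (src/server/storage) and does not describe their layout.

```typescript
// Hypothetical sketch: counts events per key inside a sliding time window.
class RollingWindowCounter {
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Records one event for `key` at `nowMs` and returns how many events
  // for that key fall inside the window ending at `nowMs`.
  record(key: string, nowMs: number): number {
    const cutoff = nowMs - this.windowMs;
    const kept = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    kept.push(nowMs);
    this.hits.set(key, kept);
    return kept.length;
  }
}
```

A scorer could call record() with a domain or fingerprint as the key and treat a high count inside the window as one signal among several.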

Safety model

  • autoActionsEnabled defaults to OFF. ModShield never removes, approves, replies, flairs, or locks without a moderator clicking Apply through the UI.
  • Every Resolve flow has a dry-run preview that shows exactly what would happen on Reddit before any real action.
  • Per-item failures are isolated — a bad action on one item does not stop the rest of the batch.
  • Every detection, every settings change, and every action (real or dry-run) is written to the audit log.
  • No external LLM, no external network fetch. Reddit app review treats premium capabilities as opt-in; ModShield's MVP avoids them entirely.
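
The dry-run, confirmation, and failure-isolation rules above could be combined in a batch resolver along these lines. This is a sketch under assumptions: the method names on RedditClientLike, the action set, and the outcome statuses are invented for the example and may differ from src/server/moderation.

```typescript
type Action = "remove" | "approve" | "none";

interface PlannedAction {
  thingId: string;
  action: Action;
}

// Hypothetical stand-in for the repo's RedditClientLike interface.
interface RedditClientLike {
  remove(thingId: string): Promise<void>;
  approve(thingId: string): Promise<void>;
}

interface ActionOutcome {
  thingId: string;
  action: Action;
  status: "previewed" | "applied" | "failed" | "skipped";
  error?: string;
}

async function resolveBatch(
  plan: PlannedAction[],
  client: RedditClientLike,
  opts: { dryRun: boolean; confirmed: boolean },
): Promise<ActionOutcome[]> {
  // Real actions are refused unless the moderator has confirmed them.
  if (!opts.dryRun && !opts.confirmed) {
    throw new Error("Real actions require moderator confirmation");
  }
  const outcomes: ActionOutcome[] = [];
  for (const { thingId, action } of plan) {
    if (action === "none") {
      outcomes.push({ thingId, action, status: "skipped" });
      continue;
    }
    if (opts.dryRun) {
      // Dry run: report what would happen without calling the client.
      outcomes.push({ thingId, action, status: "previewed" });
      continue;
    }
    try {
      if (action === "remove") await client.remove(thingId);
      else await client.approve(thingId);
      outcomes.push({ thingId, action, status: "applied" });
    } catch (err) {
      // One failing item is recorded and the rest of the batch continues.
      const error = err instanceof Error ? err.message : String(err);
      outcomes.push({ thingId, action, status: "failed", error });
    }
  }
  return outcomes;
}
```

Returning one outcome per item, whether previewed, applied, or failed, gives an audit log something to record for every step of a batch.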

Tech stack

  • Devvit Web (@devvit/web) — Redis, Reddit API, triggers, scheduler, menu
  • React 18 + TypeScript on the client (Vite-built)
  • Pure-TypeScript server with a thin in-house router so handlers are unit-testable with an in-memory Redis
  • Vitest for tests
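
To show why a narrow Redis interface makes handlers easy to unit-test, here is a minimal sketch. The three-method RedisLike subset, the "settings" key, and the default strictness of 50 are assumptions for illustration; the repo's own RedisLike and settings repo may look different.

```typescript
// Hypothetical minimal subset of a RedisLike interface.
interface RedisLike {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
  del(key: string): Promise<void>;
}

// In-memory implementation suitable for unit tests.
class InMemoryRedis implements RedisLike {
  private readonly store = new Map<string, string>();

  async get(key: string): Promise<string | undefined> {
    return this.store.get(key);
  }

  async set(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }

  async del(key: string): Promise<void> {
    this.store.delete(key);
  }
}

interface Settings {
  strictness: number;
}

const SETTINGS_KEY = "settings"; // assumed key name
const DEFAULT_SETTINGS: Settings = { strictness: 50 }; // assumed default

// Storage helpers depend only on RedisLike, never on Devvit directly.
async function loadSettings(redis: RedisLike): Promise<Settings> {
  const raw = await redis.get(SETTINGS_KEY);
  return raw ? (JSON.parse(raw) as Settings) : DEFAULT_SETTINGS;
}

async function saveSettings(redis: RedisLike, settings: Settings): Promise<void> {
  await redis.set(SETTINGS_KEY, JSON.stringify(settings));
}
```

In production an adapter would wrap Devvit's Redis client in the same interface, so the helpers above run unchanged in both environments.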

Project layout

src/
  shared/             types, normalize, fingerprint, domains, risk levels
  server/
    storage/          Redis repos: settings, incidents, items, audit, metrics, windows
    scoring/          one scorer per incident type
    engine/           processThing(), incident merge, suggested-action picker
    moderation/       resolve flow, RedditClientLike interface
    routes/           api / triggers / cron / menu, plus the router
    demo/             seed + reset
    devvitAdapters.ts wraps Devvit's RedisClient into our RedisLike interface
    index.ts          Devvit Web server entry — Node HTTP handler
  client/
    pages/            Dashboard, IncidentDetail, Settings, History
    components/       cards, badges, table, settings form, audit timeline, demo panel
    api.ts, App.tsx, router.tsx, main.tsx, styles.css
tests/                Vitest specs for scoring, engine, storage, router, resolve, demo

Run it

Devvit playtest requires Node 22.2+; older Node versions are sufficient for the typecheck and unit tests.

npm install
npm run typecheck       # tsc --noEmit
npm test                # vitest run
npm run build:client    # vite build into ./public
npm run dev             # devvit playtest (requires Devvit CLI login)

npm run dev runs devvit playtest, which installs the app in a development subreddit and streams its logs as you change the code.

Demo flow

  1. Open the app post on your test subreddit.
  2. Click Seed demo data. Four scenarios are written to Redis without touching real Reddit content.
  3. Open one of the resulting incidents. Pre-selected high-risk items are ready to act on.
  4. Click Dry-run resolve to preview every action that would be taken on Reddit.
  5. Click Apply to Reddit and confirm in the moderator confirmation modal to run the actions for real. (In the in-memory tests the Reddit client is mocked.)
  6. Watch the time-saved metric tick up on the dashboard and the audit log record every step.

Acceptance checklist

  • npm run typecheck passes ✓
  • npm test — 48 tests across 8 files pass ✓
  • npm run build:client — Vite builds the React app into ./public
  • Empty-state dashboard renders correctly when Redis is fresh
  • Demo seed creates incidents and resolve dry-run works
  • Settings load + save round-trips
  • Trigger handlers parse mocked payloads and emit incidents
  • No external LLM/API or network fetch in MVP

Known limitations

  • Heuristic scoring only — no semantic toxicity model. Toxic-language detection is keyword-based and configurable.
  • Rolling-window counters are per-installation; sister-subreddit signals are not shared.
  • Suggested actions are conservative by design. The MVP never escalates to bans or lockouts; that's a deliberate post-MVP feature gated behind app review.
  • Lock thread support depends on the type of the target item; the MVP wires the call through but does not expose a primary lock-thread UI.
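
To make the first limitation concrete, keyword-based detection can be as simple as the sketch below. The function name and the whole-word, single-word matching rule are assumptions; the real scorer may tokenize or match phrases differently.

```typescript
// Hypothetical sketch: returns the configured keywords that appear in `text`
// as whole words, ignoring case. Multi-word phrases are not handled here.
function matchKeywords(text: string, keywords: string[]): string[] {
  const tokens = new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
  return keywords.filter((k) => tokens.has(k.toLowerCase()));
}
```

A matcher like this is transparent and easy to configure, but it has no notion of context: it cannot distinguish a quoted insult from a direct one, and it misses any wording that is not on the list.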

Future improvements

  • Optional, opt-in AI summary / explainer mode (after Devvit app review for premium capabilities).
  • Cross-incident mod-rep memory (repeat offender tracking with configurable retention).
  • Slack / modmail digest of new high-confidence incidents.
  • Community-sourced spam/toxic lists shared between participating subreddits.
  • Per-flair rule presets (e.g. specific required fields by flair).

Submission notes

  • The app listing copy and the 60-second video script will be added alongside this repo under docs/ once the video is recorded.
  • Privacy notes: ModShield only stores Reddit thing IDs, post/comment excerpts that the moderator already has access to, and aggregate counters. Demo data is namespaced and POST /api/demo/reset clears it cleanly.
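
One way namespacing can make a reset clean is to give every demo key a shared prefix and delete only keys with that prefix. The sketch below uses a plain Map and an invented "demo:" prefix; how the repo actually namespaces its Redis keys is not described in this README.

```typescript
const DEMO_PREFIX = "demo:"; // assumed prefix, for illustration only

class NamespacedStore {
  private readonly data = new Map<string, string>();

  // Demo writes are stored under the demo prefix; real writes are not.
  put(key: string, value: string, demo = false): void {
    this.data.set(demo ? DEMO_PREFIX + key : key, value);
  }

  has(key: string, demo = false): boolean {
    return this.data.has(demo ? DEMO_PREFIX + key : key);
  }

  // Deletes every demo key and returns how many were removed.
  resetDemo(): number {
    let removed = 0;
    for (const key of Array.from(this.data.keys())) {
      if (key.startsWith(DEMO_PREFIX)) {
        this.data.delete(key);
        removed++;
      }
    }
    return removed;
  }
}
```

Keeping demo and real data in disjoint key spaces means a reset cannot remove anything a real detection wrote.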
