feat(telemetry): anonymous workflow-invocation tracking via PostHog #1261

@coleam00

Problem

We have zero visibility into how Archon is actually used in the wild. We don't know which bundled workflows get real usage, whether users are writing their own, or which user-facing features to invest in next. A lightweight, anonymous usage signal would unblock those product decisions without compromising the single-developer-tool ethos.

  • What problem: No data on workflow adoption — we can't tell whether archon-assist, archon-implement, archon-plan etc. are equally used, or whether any are dead weight. We also have no signal on how many users write their own workflows and what they call them.
  • Who experiences it: Maintainers (product decisions), and by extension all users (better prioritization, less cruft).
  • How often: Continuously — every roadmap decision today is made blind.

Proposed Solution

Integrate PostHog (posthog-node) for anonymous, opt-out-able telemetry. Emit one event, workflow_invoked, every time executeWorkflow() kicks off, regardless of trigger source (CLI, web, chat adapters, GitHub @mentions).

Event properties:

| Property | Value | Source |
| --- | --- | --- |
| workflow_name | e.g. archon-implement, my-custom-flow | workflow.name |
| workflow_description | First ~200 chars of the workflow description | workflow.description |
| workflow_source | bundled, project, or global | Threaded from discovery |
| platform | cli, web, slack, telegram, discord, github, gitlab, or gitea | platform.getPlatformType() |
| archon_version | Semver from root package.json | Build-time or runtime read |
| $process_person_profile | false | Keeps events in the anonymous tier (no PostHog person profile created) |

Distinct ID: A stable anonymous UUID generated once on first run and persisted to ~/.archon/telemetry-id (or equivalent in ARCHON_HOME). Lets us count distinct installs and correlate workflow usage per-install, without any user identity. The ID never touches posthog.identify() so events stay in PostHog's cheaper anonymous tier.

No PII. Ever. No git remotes, no file paths, no usernames, no prompts/messages, no tokens. Workflow names and descriptions authored by the user are the only free-form strings, and description is truncated.
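
As a rough illustration of the property table and the truncation rule, the payload could be assembled by a small pure function. This is a minimal sketch, not Archon's actual code: `WorkflowInvokedInput`, `WorkflowInvokedProperties`, `buildWorkflowInvokedProperties`, and `MAX_DESCRIPTION_CHARS` are hypothetical names, and the exact cut-off of 200 is an assumption based on the "~200 chars" wording above.

```typescript
// Hypothetical types: field names mirror the property table, not real Archon code.
type WorkflowSource = 'bundled' | 'project' | 'global';

interface WorkflowInvokedInput {
  name: string;
  description: string;
  source?: WorkflowSource;
  platform: string;
  archonVersion: string;
}

interface WorkflowInvokedProperties {
  workflow_name: string;
  workflow_description: string;
  workflow_source?: WorkflowSource;
  platform: string;
  archon_version: string;
  $process_person_profile: false;
}

// Assumed limit; the issue only says "first ~200 chars".
const MAX_DESCRIPTION_CHARS = 200;

function buildWorkflowInvokedProperties(input: WorkflowInvokedInput): WorkflowInvokedProperties {
  return {
    workflow_name: input.name,
    // slice() counts UTF-16 code units, which is close enough for a soft limit.
    workflow_description: input.description.slice(0, MAX_DESCRIPTION_CHARS),
    // Omit the key entirely when the source has not been threaded through.
    ...(input.source !== undefined ? { workflow_source: input.source } : {}),
    platform: input.platform,
    archon_version: input.archonVersion,
    // Keeps the event in the anonymous tier.
    $process_person_profile: false,
  };
}
```

Keeping this as the only place where properties are built would also give the PII audit a single function to review.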

Research: Implementation Blueprint

Integration chokepoint

All invocation paths converge on a single function — executeWorkflow() in packages/workflows/src/executor.ts:230. There are three direct call sites:

  1. CLI: packages/cli/src/commands/workflow.ts:613
  2. Orchestrator foreground/interactive/resume: packages/core/src/orchestrator/orchestrator-agent.ts:276, 288, 314
  3. Orchestrator background (web async): packages/core/src/orchestrator/orchestrator.ts:365

At executor.ts:618-623 the engine already fires a typed workflow_started event on the singleton WorkflowEventEmitter (packages/workflows/src/event-emitter.ts:27-32). That's the right hook site — subscribe once at server/CLI startup and call posthog.capture() inside the listener. Existing subscribers (the web SSE adapter) prove the pattern scales and stays error-isolated.
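
The subscribe-once pattern described above might look roughly like the following. This is a sketch under stated assumptions: `TinyEmitter` is a stand-in for the real WorkflowEventEmitter (whose actual method names are not shown in this issue), `attachTelemetry` and the `workflowName` field are hypothetical, and `TelemetryClient` only mirrors the object-argument shape of posthog-node's `capture()`.

```typescript
// Minimal structural type for the client; posthog-node's capture() takes an object like this.
interface CaptureMessage {
  distinctId: string;
  event: string;
  properties?: Record<string, unknown>;
}
interface TelemetryClient {
  capture(message: CaptureMessage): void;
}

// Hypothetical event shape; the real WorkflowStartedEvent may differ.
interface WorkflowEvent {
  type: string;
  workflowName?: string;
}
type Listener = (event: WorkflowEvent) => void;

// Stand-in emitter, only here to make the sketch self-contained.
class TinyEmitter {
  private listeners = new Set<Listener>();
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener);
  }
  emit(event: WorkflowEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// Subscribes once and returns an unsubscribe function.
function attachTelemetry(emitter: TinyEmitter, client: TelemetryClient, distinctId: string): () => void {
  return emitter.subscribe((event) => {
    if (event.type !== 'workflow_started') return;
    try {
      client.capture({
        distinctId,
        event: 'workflow_invoked',
        properties: { workflow_name: event.workflowName, $process_person_profile: false },
      });
    } catch {
      // Telemetry must never break workflow execution, so errors are swallowed here.
    }
  });
}
```

Wrapping the capture call in its own try/catch keeps the listener safe even if the emitter itself does not isolate subscriber errors.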

Data already in scope at the hook

| Field | In scope? | Notes |
| --- | --- | --- |
| workflow.name | Yes | Required field |
| workflow.description | Yes | Required field, z.string().min(1) |
| workflowRun.id | Yes | Generated by engine |
| platform.getPlatformType() | Yes | Existing method on IWorkflowPlatform |
| workflow.source (bundled/project) | No | WorkflowSource is carried by WorkflowWithSource during discovery (packages/workflows/src/schemas/workflow.ts:96-102) but stripped at all three call sites, which pass only WorkflowDefinition |

Implementation note: thread source: WorkflowSource through as an optional parameter to executeWorkflow() and extend WorkflowStartedEvent with an optional source field. All three call sites already have the WorkflowWithSource in scope (they do .map(ws => ws.workflow) to drop it); pass ws.source through instead.
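
To make the threading concrete, here is a simplified sketch of the optional parameter and the extended event. All shapes are assumptions for illustration: the real `WorkflowDefinition`, `WorkflowWithSource`, and `WorkflowStartedEvent` have more fields, `executeWorkflowSketch` and `runDiscoveredWorkflow` are hypothetical stand-ins, and the three-value `WorkflowSource` union is taken from the property table rather than from the schema file.

```typescript
// Simplified, hypothetical shapes; the real schemas carry more fields.
type WorkflowSource = 'bundled' | 'project' | 'global';

interface WorkflowDefinition {
  name: string;
  description: string;
}
interface WorkflowWithSource {
  workflow: WorkflowDefinition;
  source: WorkflowSource;
}
interface WorkflowStartedEvent {
  type: 'workflow_started';
  workflowName: string;
  source?: WorkflowSource; // new optional field
}

// Stand-in for executeWorkflow(): the only change is the optional trailing parameter.
function executeWorkflowSketch(
  workflow: WorkflowDefinition,
  emit: (event: WorkflowStartedEvent) => void,
  source?: WorkflowSource,
): void {
  emit({
    type: 'workflow_started',
    workflowName: workflow.name,
    ...(source !== undefined ? { source } : {}),
  });
}

// Call-site change: pass ws.source along instead of dropping it.
function runDiscoveredWorkflow(ws: WorkflowWithSource, emit: (event: WorkflowStartedEvent) => void): void {
  executeWorkflowSketch(ws.workflow, emit, ws.source);
}
```

Because the parameter and the event field are both optional, existing callers and subscribers would keep compiling unchanged.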

SDK choice

  • posthog-node (server-side; v5.29.2). posthog-js is the browser SDK and does not belong in server code.
  • Bun-compatible — PostHog docs explicitly list bun add posthog-node as a supported install path.

Anonymous tracking pattern

PostHog requires a distinctId. For tool-analytics with no user identity:

  • Generate a UUID on first run, persist to ~/.archon/telemetry-id.
  • Send $process_person_profile: false on every event — this puts events in PostHog's anonymous-tier table (no person joins, up to 4x cheaper, and no person profile is ever created).
  • Never call posthog.identify() on that ID.
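
The first bullet could be implemented along these lines. This is a minimal sketch: `getOrCreateTelemetryId` and `UUID_PATTERN` are hypothetical names, the caller is assumed to pass the resolved Archon home directory, and the fallback to an unpersisted ID when the directory is not writable is a design assumption rather than something the issue specifies.

```typescript
import { randomUUID } from 'node:crypto';
import { mkdirSync, readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the persisted anonymous install ID, creating it on first run.
function getOrCreateTelemetryId(archonHome: string): string {
  const idPath = join(archonHome, 'telemetry-id');
  try {
    const existing = readFileSync(idPath, 'utf8').trim();
    if (UUID_PATTERN.test(existing)) return existing;
  } catch {
    // File missing or unreadable: fall through and generate a fresh ID.
  }
  const id = randomUUID();
  try {
    mkdirSync(archonHome, { recursive: true });
    writeFileSync(idPath, `${id}\n`, 'utf8');
  } catch {
    // Assumption: on a read-only home, use an unpersisted ID instead of failing.
  }
  return id;
}
```

The returned string would then be used as the `distinctId` on every capture call and nowhere else.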

Opt-out

Support three mechanisms, in priority order:

  1. Env var: ARCHON_TELEMETRY_DISABLED=1 (Archon-specific)
  2. Env var: DO_NOT_TRACK=1 (de facto standard honored by Astro/Bun/Prisma/Nuxt/Expo)
  3. Config: telemetry.enabled: false in ~/.archon/config.yaml or repo .archon/config.yaml

Implementation: if any opt-out is set, call await posthog.disable() at init. This makes all subsequent capture() calls silent no-ops — no conditional guards needed throughout the codebase.
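
The opt-out check itself can be a small pure function, which also makes the opt-out unit tests straightforward. The sketch below is illustrative: `isTelemetryDisabled`, `isTruthyFlag`, and the two interfaces are hypothetical names, and treating any value other than an empty string, `0`, or `false` as "set" is an assumption, since the issue only specifies `=1`.

```typescript
// Only the keys this check needs; process.env could be passed in by the caller.
interface TelemetryEnv {
  ARCHON_TELEMETRY_DISABLED?: string | undefined;
  DO_NOT_TRACK?: string | undefined;
}

// Hypothetical slice of the parsed config.yaml.
interface TelemetryConfig {
  telemetry?: { enabled?: boolean };
}

// Assumption: empty, "0", and "false" mean "not set"; anything else opts out.
function isTruthyFlag(value: string | undefined): boolean {
  if (value === undefined) return false;
  const normalized = value.trim().toLowerCase();
  return normalized !== '' && normalized !== '0' && normalized !== 'false';
}

// Checks the three mechanisms in the priority order listed above.
function isTelemetryDisabled(env: TelemetryEnv, config: TelemetryConfig = {}): boolean {
  if (isTruthyFlag(env.ARCHON_TELEMETRY_DISABLED)) return true;
  if (isTruthyFlag(env.DO_NOT_TRACK)) return true;
  return config.telemetry?.enabled === false;
}
```

At init, a `true` result would lead to the single `disable()` call described above, so no other code path needs to know about opt-out.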

Lifecycle

  • Server mode: hook posthog.shutdown() on SIGTERM / SIGINT.
  • CLI mode: each CLI command calls await posthog.shutdown() at the end, or uses captureImmediate() to guarantee flush before process exit. Short-lived processes lose buffered events if they exit without flushing.
  • Attach posthog.on('error', ...) → log-and-swallow. Telemetry must never crash the app.
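
For server mode, the signal handling could be sketched as below. The names are hypothetical (`installShutdownHooks`, `FlushableClient`, `SignalSource`); the client interface only assumes a `shutdown()` that returns a promise, and exiting with code 0 after the flush is an assumption, not a stated requirement.

```typescript
// Structural stand-ins so the sketch does not depend on posthog-node or Node's process type.
interface FlushableClient {
  shutdown(): Promise<void>;
}
interface SignalSource {
  once(signal: 'SIGTERM' | 'SIGINT', handler: () => void): unknown;
}

// Registers one handler per signal; flushes telemetry once, then exits.
function installShutdownHooks(
  proc: SignalSource,
  client: FlushableClient,
  exit: (code: number) => void,
): void {
  let shuttingDown = false;
  const handler = (): void => {
    if (shuttingDown) return; // a second signal must not trigger a second flush
    shuttingDown = true;
    client
      .shutdown()
      .catch(() => undefined) // a failed flush must never block process exit
      .finally(() => exit(0));
  };
  proc.once('SIGTERM', handler);
  proc.once('SIGINT', handler);
}
```

In CLI mode the same client would instead be awaited at the end of each command, since there is no long-lived process to signal.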

PostHog API key handling

phc_* is a public write-only key — safe to embed in source / distributed binaries. The OSS pattern is: maintainer bakes their own phc_* into a constant (ideally read from process.env.POSTHOG_API_KEY at build time, falling back to the embedded default). If the env var is unset and no key is baked in, telemetry self-disables silently.
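
The key-resolution order described above reduces to a few lines. This is a sketch with hypothetical names (`resolvePostHogKey`, `EMBEDDED_POSTHOG_KEY`); the embedded constant is left empty here as a placeholder, and an `undefined` result stands for "telemetry self-disables".

```typescript
// Placeholder for the key a maintainer would bake in at build time.
const EMBEDDED_POSTHOG_KEY = '';

interface KeyEnv {
  POSTHOG_API_KEY?: string | undefined;
}

// Env var wins, then the embedded default; undefined means telemetry stays off.
function resolvePostHogKey(env: KeyEnv, embedded: string = EMBEDDED_POSTHOG_KEY): string | undefined {
  const fromEnv = env.POSTHOG_API_KEY?.trim();
  if (fromEnv) return fromEnv;
  const baked = embedded.trim();
  return baked !== '' ? baked : undefined;
}
```

A caller would skip client construction entirely when this returns `undefined`, which keeps the "no key, no telemetry" path silent.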

Suggested module shape

New thin module (no new workspace package needed):

  • packages/core/src/services/telemetry.ts — singleton client, captureWorkflowInvoked() helper, opt-out handling, shutdown() export.
  • Subscribe to WorkflowEventEmitter once in packages/server/src/index.ts (server) and once in packages/cli/src/cli.ts (CLI).
  • posthog-node added to packages/core/package.json.

User Flow

Before (current)

Maintainer: "Which bundled workflow gets used the most?"
[!] No idea. Best we can do is count GitHub stars.

After (proposed)

[+] Every workflow invocation -> posthog.capture({ event: 'workflow_invoked', ... })
[+] PostHog UI: Trends chart filtered by workflow_name, grouped by workflow_source
    -> "archon-implement: 1,240 runs / 60% bundled / 40% project-defined this week"
[+] Users who opt out (ARCHON_TELEMETRY_DISABLED=1 or DO_NOT_TRACK=1): silent no-op

Alternatives Considered

| Alternative | Pros | Cons | Why not chosen |
| --- | --- | --- | --- |
| Self-hosted Plausible / Umami | Privacy-first, full control | Extra infra to run; weaker event-analytics UX | Overkill for a single-developer tool; PostHog has a generous free tier |
| Segment + multiple destinations | Flexibility | Another dependency layer; cost | YAGNI: we have one destination |
| Custom endpoint (post to archon-api) | No third-party | We'd be building analytics infra | Distracts from product work |
| Instrument every lifecycle event (workflow_completed, node_started, ...) | More data | Noise, cost, analysis paralysis | Start with one event; expand only if a question demands it |
| Store source at capture time via re-discovery | Less threading | Couples telemetry to the discovery layer | Threading through executeWorkflow is trivial and keeps boundaries clean |

Scope

  • Package(s) likely affected: core (telemetry module + config), workflows (thread source param + extend event type), server (startup subscribe + SIGTERM flush), cli (startup subscribe + shutdown-on-exit), paths (optional — anonymous ID helper could live next to archon-paths.ts)
  • Breaking change? No
  • Database changes needed? No
  • New external dependencies? Yes: posthog-node (one runtime dep)

Security Considerations

  • New permissions/capabilities? No — no new permissions needed on the user's machine.
  • New external network calls? Yes — outbound HTTPS to us.i.posthog.com (or EU). Batched, async, non-blocking. Fully opt-out-able.
  • Secrets/tokens handling? No — the phc_* key is public-write-only; safe to embed.
  • PII audit: confirm at code review that nothing downstream adds PII. Only workflow_name + workflow_description are free-form strings, both authored by the user.
  • Anonymous-tier guarantee: the chosen distinctId must never be passed to posthog.identify() — add a lint/comment guard.
  • Doc update: add a "Telemetry" section to the README explaining what is collected, how to opt out, and linking to this issue.

Definition of Done

  • posthog-node added as a dependency; telemetry module initializes a singleton client on startup (server + CLI)
  • workflow_invoked event fires from the WorkflowEventEmitter.workflow_started subscription with all six properties listed above
  • WorkflowStartedEvent extended with optional source; executeWorkflow() threads source from all three call sites
  • Anonymous install UUID generated and persisted to ~/.archon/telemetry-id on first run; stable across restarts
  • Opt-out via ARCHON_TELEMETRY_DISABLED=1, DO_NOT_TRACK=1, or telemetry.enabled: false in config
  • posthog.shutdown() hooked on SIGTERM / SIGINT for server and called at end of each CLI command
  • posthog.on('error', ...) handler logs-and-swallows; no telemetry failure can crash Archon
  • README / docs section documenting what's collected and how to opt out
  • Tests: unit test for the event-emitter subscription path verifying capture() is called with the right properties; unit test for opt-out env vars
  • Manual validation: run one workflow end-to-end via CLI, web, and a chat adapter; confirm events appear in PostHog Live Events view with expected properties
