Skip to content

feat(origin): detect parent agent/IDE and propagate to HeyGen API + PostHog#171

Merged
jrusso1020 merged 1 commit into
mainfrom
feat/client-origin-detection
Jun 12, 2026
Merged

feat(origin): detect parent agent/IDE and propagate to HeyGen API + PostHog#171
jrusso1020 merged 1 commit into
mainfrom
feat/client-origin-detection

Conversation

@jrusso1020

Copy link
Copy Markdown
Collaborator

Description

Adds a new internal/origin package that detects which parent agent / IDE / managed sandbox launched the CLI, and wires the result into two outbound surfaces:

  1. X-HeyGen-Client-Origin: <origin> header on every HTTP request from the CLI (internal/client/client.go). The header is omitted when origin is unknown so the backend can distinguish unknown from explicit-empty. This matches the existing self-declared-trust pattern of X-HeyGen-Client-User-Agent.
  2. client_origin property on every PostHog event from the CLI (internal/analytics/analytics.go). Same omit-when-empty rule so funnel queries get clean cohorts.

Detect() is called once at startup from client.New and analytics.newWithCapture and the result is cached for the process lifetime — parent context doesn't change mid-invocation.

Why

The backend's compute_request_source label scheme currently produces cli / cli:api_key for every CLI invocation, regardless of whether it came from a developer in claude_code, cursor, a CI runner, or a plain shell. User-agent matching server-side is fragile and our internal CLI ships a generic UA. A self-declared client header is honest about the trust model and lets the backend follow up with cli:claude_code, cli:cursor, etc., falling back to today's behavior when the header is absent.

This is the CLI half of the change. A follow-up backend PR (stacked on the in-flight one that just dropped server-side cli provider-qualification) will read the header.

Detection rules

Ported from hyperframes-oss packages/cli/src/telemetry/agent_runtime.ts so the two CLIs stay in lockstep on which agent names exist and how each is identified. Existence checks only — we never read the value of an env var (some are secrets).

Origin Trigger
claude_code CLAUDECODE or CLAUDE_CODE_ENTRYPOINT
codex CODEX_THREAD_ID or CODEX_CI or CODEX_SANDBOX_NETWORK_DISABLED
cursor TERM_PROGRAM=cursor (exact)
windsurf TERM_PROGRAM ~ windsurf (case-insensitive)
copilot_agent GITHUB_ACTIONS=true AND (COPILOT_AGENT_ID set OR RUNNER_NAME=Copilot)
replit REPL_ID or REPLIT_USER
hermes HERMES_QUIET
openclaw OPENCLAW_STATE_DIR or OPENCLAW_CONFIG_PATH
pi PI_CODING_AGENT
cline CLINE_ACTIVE
gemini_cli GEMINI_CLI
crush CRUSH
gemini_managed_agent /.agents/ is a directory AND gVisor kernel detected (4.19.0-gvisor osrelease or gVisor in /proc/version)

First match wins; ordering pinned by test (TestDetect_FirstMatchWins).

Testing

  • go test ./internal/origin/... — 25 subcases covering every vendor rule, empty-env baseline, case-sensitivity (cursor exact vs windsurf case-insensitive), first-match-wins ordering, Copilot's two-marker conjunction, and the non-Linux short-circuit on isGeminiManagedAgent.
  • go test ./internal/client/... — added TestClient_Do_SetsClientOrigin (header present when origin set) and TestClient_Do_OmitsClientOriginWhenEmpty (no header at all when origin empty — http.Header map key absent, not empty string value).
  • go test ./internal/analytics/... — added TestCommandRun_IncludesClientOrigin, TestCommandRunComplete_IncludesClientOrigin, and TestCommandRun_OmitsClientOriginWhenEmpty.
  • go test ./... — full suite green.
  • make build — builds locally.
  • golangci-lint run ./... (v2.11.4, matching CI) — 0 issues.

Follow-up: backend PR will read X-HeyGen-Client-Origin from compute_request_source to produce cli:<origin> labels.

PR by Jerrai

…ytics

Adds internal/origin: a port of the hyperframes-oss agent-runtime detector
(13 vendor rules + gemini_managed_agent filesystem detection). Existence-
based env-var checks; never reads secret values. Detect() is called once
at startup from internal/client.New and internal/analytics.newWithCapture
and the result is cached for the process lifetime.

Wiring:
- internal/client.Do() sets `X-HeyGen-Client-Origin: <origin>` on every
  outbound request when origin is non-empty (header is omitted when
  unknown so the backend can distinguish "unknown" from explicit-empty).
- internal/analytics CommandRun / CommandRunComplete attach
  `client_origin` to PostHog events. Empty origin → property is omitted
  (same separation rationale).

Why: the backend's compute_request_source label scheme can't currently
distinguish a CLI invocation from claude_code vs cursor vs a plain shell
— user-agent matching is fragile and our internal CLI sets a generic UA.
A self-declared client header matches the existing X-HeyGen-Client-User-Agent
trust pattern and a follow-up backend PR will key on it to yield
`cli:claude_code`, `cli:cursor`, etc., falling back to today's behavior.

Detection: any of CLAUDECODE, CLAUDE_CODE_ENTRYPOINT, CODEX_THREAD_ID,
CODEX_CI, CODEX_SANDBOX_NETWORK_DISABLED, TERM_PROGRAM=cursor (exact)
or windsurf (case-insensitive), GITHUB_ACTIONS=true paired with
COPILOT_AGENT_ID or RUNNER_NAME=Copilot, REPL_ID / REPLIT_USER,
HERMES_QUIET, OPENCLAW_STATE_DIR / OPENCLAW_CONFIG_PATH, PI_CODING_AGENT,
CLINE_ACTIVE, GEMINI_CLI, CRUSH. gemini_managed_agent requires both
/.agents/ as a directory AND gVisor kernel — neither alone is unique.
First match wins; ordering pinned by test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jrusso1020 jrusso1020 merged commit 55f5a38 into main Jun 12, 2026
9 checks passed
@jrusso1020 jrusso1020 deleted the feat/client-origin-detection branch June 12, 2026 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants