fix(errors): classify network errors as exit code 4#4
Conversation
Mixpanel telemetry currently buckets every API-call failure as error_class=internal because the only command paths that wrapped errors returned *output.InternalError unconditionally. That made the "internal" bucket impossible to read — a transient DNS timeout looked identical to a real CLI bug. Add a NetworkError type (exit code 4 = network) and an IsNetworkErr helper that detects the standard transport-level failures from net/http: net.Error.Timeout(), *net.DNSError, *net.OpError, *url.Error chains, and ECONNREFUSED / ECONNRESET / EHOSTUNREACH / ENETUNREACH / EPIPE syscalls. ClassifyError now applies the same detection to raw errors that bubble up unwrapped, so commands that don't explicitly wrap their errors still classify correctly. The two known sites that wrap a network call (auth.go and hello.go, both validating credentials via client.ListProjects) now return NetworkError when the cause is connectivity-related. Successful HTTP exchanges that returned non-2xx status (e.g. 401, 5xx) are unchanged and still flow through the existing AuthError / InternalError paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Caution Review failedPull request was closed or merged during review WalkthroughThe PR adds network error detection across the application. A new Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~15 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7eb29667e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if output.IsNetworkErr(err) { | ||
| return &output.NetworkError{Message: "validate credentials", Cause: err} |
There was a problem hiding this comment.
Map network failures to
network_error JSON code
Returning &output.NetworkError{...} here causes ErrorResponseFromErr to emit code: "error" in JSON mode because that switch only handles UserError, AuthError, and InternalError (see internal/output/breadcrumbs.go, ErrorResponseFromErr). This regresses machine-readable semantics for these login network failures: clients no longer get the documented network_error code and retryable may be false for non-timeout transport errors like connection refused.
Useful? React with 👍 / 👎.
Summary
output.NetworkError(exit code 4) andoutput.IsNetworkErrhelpernet.Error.Timeout(),*net.DNSError,*net.OpError,*url.Errorchains, andECONNREFUSED/ECONNRESET/EHOSTUNREACH/ENETUNREACH/EPIPEsyscallsClassifyErrornow applies the same detection to raw errors that bubble up unwrappedauth loginandhelloboth validating credentials viaclient.ListProjects) now returnNetworkErrorwhen the cause is connectivity-relatedWhy
Mixpanel telemetry currently buckets every API-call failure as
error_class=internalbecause every command path wrapped errors as*output.InternalErrorunconditionally. That made the "internal" bucket unreadable — a transient DNS timeout looked identical to a real CLI bug. Today's data:auth loginshows 6 `internal` failures in 30 days from 4 users on 4 different versions, no obvious server-side cause. With this change, those will split between trueinternal(CLI bugs) andnetwork(transport failures), so the "internal" bucket becomes a real signal worth investigating.Scope
internal/output/errors.goauth.goandhello.goopt into itmcp.go,setup.go,config.go, etc.) wrap genuinely-local operations (file I/O, keyring, term reads) — they're correctlyinternaland don't need to changeClassifyError's fallback detection means any future API-call site that doesn't explicitly classify will still pick upnetworkautomatically when applicableTest plan
go test ./...— greengo vet ./...— cleangolangci-lint run(latest) — 0 issues*net.DNSError,*net.OpError,net.Error.Timeout(),*url.Errorwrapping a network cause,*url.Errorwrapping a non-network cause (negative case), andClassifyErrorpicking up a raw*net.OpErrorand*url.ErrorchainNotes
This pairs with #3 (sanitized error_message in telemetry) — together they give us "which class" + "what specifically" for every failure.
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes