fix(wacli-keepalive): defer initial backfill until --follow is connected#138

Merged
auroracapital merged 6 commits into main from fix/wacli-persistent-connection
Apr 18, 2026

Conversation

@auroracapital auroracapital commented Apr 18, 2026

Summary

  • Removes the pre---follow backfill block (auto_detect_empty_chats → detect_missed_messages → run_backfill) that ran at lines 347–349 of claude-ops/scripts/wacli-keepalive.sh.
  • Sets LAST_BACKFILL_TIME=0 so the first iteration of the --follow supervisor loop fires periodic_backfill immediately — reusing the warm session via acquire_wacli_batch instead of opening dozens of fresh WebSockets.
  • Comments updated to document the ordering invariant.

Why

Symptom reported after a fresh daemon restart: wacli-sync marked disconnected, /ops:comms commands slow, daemon logs show HEALTH: wacli-sync pid=N is dead every ~90s, keepalive log flooded with:

failed to dial whatsapp web websocket: ... Get "https://web.whatsapp.com/ws/chat":
  dial tcp: lookup web.whatsapp.com: no such host

…or on the better-connected path:

BACKFILL: timed out waiting for on-demand history sync response

Root cause: each wacli history backfill --chat=<jid> invocation opens its own WhatsApp WebSocket. On startup the script queues 40–50 chats (auto_detect_empty_chats + detect_missed_messages) and immediately runs them serially through run_backfill — before wacli sync --follow has a chance to establish the persistent connection the rest of the system expects.

When that many subprocess handshakes fire in quick succession, WhatsApp's fresh-session rate limit trips. The failures arrive as DNS lookup errors (throttled resolver → no such host) or 30 s timeouts waiting for the on-demand history sync response. Either way, the script never reaches line 494 (SYNC: starting persistent sync --follow) for 5–20 minutes after every launchd restart, and user-facing wacli doctor reports connected:false the whole time.

The fix

periodic_backfill already encapsulates the correct ordering: pause --follow via acquire_wacli_batch → run the same three functions (auto_detect_empty_chats + detect_missed_messages + run_backfill) plus write_backfill_memory → release. Forcing its first iteration to fire immediately (via LAST_BACKFILL_TIME=0) gives the same behaviour without the pre---follow race: backfills run one at a time against the warm session, and the persistent connection comes up in seconds.

This also matches what the script's own header promises: "3. Run wacli sync --follow for persistent message streaming".
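
The scheduling trick described above can be sketched roughly as follows. This is a minimal illustration, not the code from wacli-keepalive.sh: `backfill_due` and `supervisor_tick` are hypothetical helper names, and only `LAST_BACKFILL_TIME`, `BACKFILL_INTERVAL` and the `PERIODIC-BACKFILL:` log prefix come from the PR text.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: backfill_due and supervisor_tick are illustrative names,
# not functions from wacli-keepalive.sh.

BACKFILL_INTERVAL=1800   # 30 min, the default mentioned in the test plan
LAST_BACKFILL_TIME=0     # epoch 0 means "never ran", so the first check is due

# Returns 0 (true) when a periodic backfill is due at time $1 (epoch seconds).
backfill_due() {
  local now=$1
  (( now - LAST_BACKFILL_TIME >= BACKFILL_INTERVAL ))
}

# One supervisor tick: report whether a backfill is due and record when it ran.
supervisor_tick() {
  local now=$1
  if backfill_due "$now"; then
    echo "PERIODIC-BACKFILL: due at $now"
    LAST_BACKFILL_TIME=$now
  else
    echo "PERIODIC-BACKFILL: not due at $now"
  fi
}

supervisor_tick 1700000000   # first tick: due, because LAST_BACKFILL_TIME=0
supervisor_tick 1700000060   # one minute later: not due
supervisor_tick 1700001800   # 30 minutes after the first run: due again
```

With the timestamp initialised to 0, no separate startup code path is needed; the ordinary interval check is enough to trigger the first run.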

Diff stats

1 file changed, 15 insertions(+), 4 deletions(-)

Test plan

  • bash -n claude-ops/scripts/wacli-keepalive.sh — syntax OK
  • tests/test-no-secrets.sh — 11/11 pass
  • Manual: restart daemon (launchctl kickstart -k gui/$(id -u)/com.claude-ops.daemon) and confirm wacli doctor reports connected:true within ~30 s instead of 5+ min.
  • Manual: verify first periodic_backfill cycle runs within the first monitor-loop iteration after --follow starts (check LAST_BACKFILL_TIME log line and PERIODIC-BACKFILL: checking for chats needing backfill appearing near the top of wacli-keepalive.log).
  • Manual: after first periodic cycle, confirm subsequent intervals respect BACKFILL_INTERVAL (default 30 min).

Out of scope

  • Pre-existing SC2069 warning from shellcheck on line 177 (2>&1 > "$probe_log") — not touched here.
  • The daemon's ENSURE: step auto-reinstalling a stale com.claude-ops.wacli-keepalive plist from a prior plugin version — a separate issue.

Note

Medium Risk
Moderate risk: changes request authentication headers for Anthropic API calls and adjusts WhatsApp backfill scheduling/locking, which could cause auth failures or missed/duplicated backfills if edge cases aren’t covered.

Overview
Ops memory extraction now prefers Claude Code OAuth over API keys. ops-memory-extractor.sh adds resolve_auth to fetch an unexpired OAuth token from macOS Keychain (with required extra header), falls back to API keys from env/keychain/Doppler, and updates curl calls to use the resolved header/mode.

wacli keepalive defers initial backfill and hardens batch locking. wacli-keepalive.sh removes the pre---follow startup backfill, schedules the first periodic_backfill after a short stabilization delay, and makes acquire_wacli_batch/release_wacli_batch re-entrant so the batch marker remains set for the entire backfill subshell.

Reviewed by Cursor Bugbot for commit 666c3be.

Summary by CodeRabbit

  • Improvements
    • Enhanced API key retrieval to try environment, macOS Keychain, and fallback secret store for more reliable and secure credential access.
    • Deferred initial message backfill and added reentrancy safeguards to backfill scheduling so periodic synchronization runs after stabilization and avoids redundant concurrent work.

Before this change the keepalive ran auto_detect_empty_chats → detect_missed_messages
→ run_backfill at startup, before `wacli sync --follow` had a chance to establish
a persistent WebSocket. Each `wacli history backfill` subprocess opens its own
connection to web.whatsapp.com, so with 40–50 queued chats the DNS lookups and
new-session handshakes race each other, hitting WhatsApp's fresh-session rate
limit. In practice this produced either floods of `failed to dial whatsapp web
websocket: ... lookup web.whatsapp.com: no such host` errors, or 30-second
on-demand history sync timeouts — blocking the transition to --follow for 5–20
minutes on every launchd restart and looking like wacli-sync was dying.

Fix: remove the pre-follow backfill block and set LAST_BACKFILL_TIME=0 so the
first iteration of the --follow supervisor loop fires periodic_backfill
immediately. periodic_backfill already pauses --follow via acquire_wacli_batch,
runs the same three functions through the warm session, and releases — no new
code path needed. The effect is that the persistent connection comes up in
seconds instead of minutes, and backfill runs serially against the existing
socket instead of flooding WhatsApp with parallel handshakes.

Net behaviour: --follow is now the first thing that gets established after
bootstrap + refresh_wacli_cache, exactly matching the design documented in the
script header ("Run wacli sync --follow for persistent message streaming").
chatgpt-codex-connector[bot]

This comment was marked as resolved.

cursor[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

…before doppler

The daemon that cron-schedules memory-extractor runs under launchd without
interactive Doppler auth, so `doppler secrets get ANTHROPIC_API_KEY --plain`
returned empty and the script died with
  FATAL: ANTHROPIC_API_KEY not set and doppler unavailable
every 30 minutes. No memory files were ever written.

Fix: extend resolve_api_key() with a macOS Keychain lookup that runs BEFORE
the doppler path — `security find-generic-password -s ANTHROPIC_API_KEY -w`.
This works in launchd contexts (keychain is user-scoped, not shell-scoped)
and survives plugin upgrades. The doppler path is also made explicit about
project/config via new OPS_DOPPLER_PROJECT / OPS_DOPPLER_CONFIG env vars so
users whose daemon env IS scoped to Doppler still get a deterministic lookup
instead of depending on ambient CLI auth.

New precedence (top-down):
  1. ANTHROPIC_API_KEY already in env
  2. macOS Keychain service=ANTHROPIC_API_KEY (any account)
  3. Doppler with OPS_DOPPLER_PROJECT + OPS_DOPPLER_CONFIG
  4. Doppler with ambient scope

Seed the keychain entry with:
  security add-generic-password -U -s ANTHROPIC_API_KEY -a ops-daemon \
    -w "$(doppler secrets get ANTHROPIC_API_KEY --plain \
         --project <your-project> --config <your-config>)"

The setup wizard should do this automatically during ops:setup — tracked
separately.
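
The precedence listed in this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the actual `resolve_api_key` from the PR: the function name `resolve_api_key_sketch` and the error text are hypothetical, and the sketch picks either the explicitly scoped or the ambient Doppler lookup rather than trying both. The `security` and `doppler` invocations are the ones quoted in the commit message.

```shell
#!/usr/bin/env bash
# Sketch of the lookup order only; resolve_api_key_sketch is a hypothetical name.
resolve_api_key_sketch() {
  local key=""

  # 1. Key already present in the environment.
  if [[ -n "${ANTHROPIC_API_KEY:-}" ]]; then
    printf '%s\n' "$ANTHROPIC_API_KEY"
    return 0
  fi

  # 2. macOS Keychain (skipped on systems without the `security` tool).
  if command -v security >/dev/null 2>&1; then
    key=$(security find-generic-password -s ANTHROPIC_API_KEY -w 2>/dev/null || true)
    if [[ -n "$key" ]]; then printf '%s\n' "$key"; return 0; fi
  fi

  # 3./4. Doppler: explicit scope when both OPS_DOPPLER_* vars are set,
  #       otherwise whatever scope the CLI has ambiently.
  if command -v doppler >/dev/null 2>&1; then
    local args=(secrets get ANTHROPIC_API_KEY --plain)
    if [[ -n "${OPS_DOPPLER_PROJECT:-}" && -n "${OPS_DOPPLER_CONFIG:-}" ]]; then
      args+=(--project "$OPS_DOPPLER_PROJECT" --config "$OPS_DOPPLER_CONFIG")
    fi
    key=$(doppler "${args[@]}" 2>/dev/null || true)
    if [[ -n "$key" ]]; then printf '%s\n' "$key"; return 0; fi
  fi

  # Hypothetical wording; the PR only says the message lists all three sources.
  echo "FATAL: ANTHROPIC_API_KEY not found in env, keychain, or doppler" >&2
  return 1
}

# Demo: the environment wins, so no keychain or Doppler call is made.
ANTHROPIC_API_KEY=sk-example resolve_api_key_sketch
```

Printing the key on stdout instead of exporting it keeps the credential out of child-process environments.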

coderabbitai Bot commented Apr 18, 2026

Warning

Rate limit exceeded

@blocksorg[bot] has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 49 minutes and 34 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 49 minutes and 34 seconds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dd48a2e5-ef83-4074-959f-19e8b05cda35

📥 Commits

Reviewing files that changed from the base of the PR and between 77c01e4 and 06a09dd.

📒 Files selected for processing (2)
  • claude-ops/scripts/ops-memory-extractor.sh
  • claude-ops/scripts/wacli-keepalive.sh
📝 Walkthrough

Walkthrough

Add macOS Keychain as a fallback when resolving ANTHROPIC_API_KEY; defer and stabilize wacli backfill by adding reentrancy guards, introducing an initial backfill delay, and holding the batch lock during periodic backfill.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| API Key Resolution: claude-ops/scripts/ops-memory-extractor.sh | resolve_api_key() now checks environment, then macOS Keychain (security find-generic-password -s ANTHROPIC_API_KEY -w), and finally doppler secrets get (conditionally adding --project/--config when OPS_DOPPLER_PROJECT/OPS_DOPPLER_CONFIG are set). Error message updated to list env, keychain, and doppler attempts. |
| WACLI Backfill & Locking: claude-ops/scripts/wacli-keepalive.sh | Removed immediate initial calls to auto_detect_empty_chats, detect_missed_messages, and run_backfill. Added _WACLI_BATCH_HELD guards so nested acquire_wacli_batch/release_wacli_batch become no-ops. Introduced INITIAL_BACKFILL_DELAY and adjusted LAST_BACKFILL_TIME so first periodic backfill runs after a stabilization delay. Periodic backfill's background subshell now acquires the batch and holds the lock across its work, sets/clears the held flag, then releases the lock. |

Sequence Diagram(s)

sequenceDiagram
    participant Script as Script
    participant Env as Environment
    participant Keychain as macOS Keychain
    participant Doppler as Doppler Service

    Script->>Env: check ANTHROPIC_API_KEY
    alt found in environment
        Env-->>Script: return API key
    else not found
        Script->>Keychain: security find-generic-password -s ANTHROPIC_API_KEY -w
        alt found in keychain
            Keychain-->>Script: return API key
        else not found
            Script->>Doppler: doppler secrets get ANTHROPIC_API_KEY<br/>(+ --project/--config if set)
            alt retrieved from Doppler
                Doppler-->>Script: return API key
            else all sources exhausted
                Script-->>Script: error listing env, Keychain, Doppler attempts
            end
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through env, then Keychain's door,
Doppler last to check the score.
Backfill waits till sessions steady,
Locks held tight so tasks run ready.
A cheerful thump — the scripts are more!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: deferring initial backfill until --follow is connected, which is the core fix addressing the WhatsApp rate-limit issue. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

coderabbitai[bot]

This comment was marked as resolved.

Address three race conditions flagged in PR review:

1. LAST_BACKFILL_TIME=0 fired backfill immediately on the first supervisor
   tick, tearing down the just-started --follow connection before it could
   stabilise. Replaced with a 30s stabilisation delay (INITIAL_BACKFILL_DELAY)
   so --follow has time to establish the persistent WebSocket.

2. detect_missed_messages called release_wacli_batch internally, removing
   BATCH_MARKER while the background subshell was still running. The main
   supervisor loop then saw dead sync + no marker and broke out, creating a
   restart loop. Fixed by holding the batch lock at the subshell level via
   _WACLI_BATCH_HELD, making inner acquire/release calls no-ops.

3. auto_detect_empty_chats ran inside periodic_backfill before
   acquire_wacli_batch, failing silently due to store-lock contention with
   the running --follow process. Now runs after the subshell acquires the
   batch lock.
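
A minimal sketch of fixes 1 and 2 above, under stated assumptions. The marker file name, the temp-dir `STORE`, the `inner_step` stand-in and the `initial_last_backfill_time` arithmetic are all hypothetical; the real acquire/release functions also pause and resume `wacli sync --follow`, which is omitted here.

```shell
#!/usr/bin/env bash
# Sketch only: marker path, inner_step and initial_last_backfill_time are
# illustrative and do not come from wacli-keepalive.sh.

STORE=$(mktemp -d)
BATCH_MARKER="$STORE/.batch_marker"
_WACLI_BATCH_HELD=0

acquire_wacli_batch() {
  # Reentrant: if an outer caller already holds the batch, do nothing.
  if [[ "${_WACLI_BATCH_HELD:-0}" == "1" ]]; then return 0; fi
  touch "$BATCH_MARKER"
}

release_wacli_batch() {
  # While the outer holder is active, inner releases must not remove the marker.
  if [[ "${_WACLI_BATCH_HELD:-0}" == "1" ]]; then return 0; fi
  rm -f "$BATCH_MARKER"
}

# Stand-in for detect_missed_messages, which acquires and releases internally.
inner_step() {
  acquire_wacli_batch
  release_wacli_batch
}

# Assumed arithmetic: backdate the timestamp so the first backfill becomes due
# INITIAL_BACKFILL_DELAY seconds after startup ($1) instead of immediately.
INITIAL_BACKFILL_DELAY=30
BACKFILL_INTERVAL=1800
initial_last_backfill_time() {
  echo $(( $1 - BACKFILL_INTERVAL + INITIAL_BACKFILL_DELAY ))
}

acquire_wacli_batch
_WACLI_BATCH_HELD=1
inner_step                                   # nested release is a no-op
if [[ -f "$BATCH_MARKER" ]]; then echo "marker still held after inner step"; fi
_WACLI_BATCH_HELD=0
release_wacli_batch
if [[ ! -f "$BATCH_MARKER" ]]; then echo "marker released by outer holder"; fi
```

The flag turns the marker into a lock owned by the outermost caller, so helper functions can keep their own acquire/release pairs for standalone use.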
Claude Code stores an OAuth access token in the macOS Keychain under service
"Claude Code-credentials" (JSON blob → .claudeAiOauth.accessToken, format
sk-ant-oat01-*). When present and non-expired, that token authorises calls to
api.anthropic.com via `Authorization: Bearer <token>` + the
`anthropic-beta: oauth-2025-04-20` header, and the usage is billed against
the user's Max/Pro subscription instead of per-token API rates.

This matters here because memory-extractor runs every 30 min under launchd
and was previously blocked on a missing ANTHROPIC_API_KEY. Even after my
previous commit added a keychain + doppler fallback for the API key, a user
on a Claude Max plan would still be paying API rates for background work.

There is also a known behavioural gotcha with Claude Code itself: if
ANTHROPIC_API_KEY is exported in the shell (profile, direnv, etc.) Claude
Code prefers that key over the signed-in OAuth session, silently switching
the user from subscription billing to metered billing. Preferring OAuth
here — and pointedly NOT exporting ANTHROPIC_API_KEY — sidesteps that.

Changes in this commit:

- `resolve_api_key` replaced by `resolve_auth` with precedence:
    1. Claude Code OAuth token from keychain (requires ≥ 60 s remaining life
       to avoid mid-call expiry)
    2. $ANTHROPIC_API_KEY env var
    3. macOS Keychain service "ANTHROPIC_API_KEY"
    4. Doppler (respects OPS_DOPPLER_PROJECT / OPS_DOPPLER_CONFIG)
  Emits three globals consumed by call_claude: OPS_AUTH_HEADER,
  OPS_AUTH_MODE ("oauth"|"apikey"), OPS_AUTH_EXTRA_HEADERS (array; OAuth
  mode adds the anthropic-beta header).
- `resolve_api_key` kept as a back-compat alias.
- `call_claude` now emits `OPS_AUTH_HEADER` and expands
  `OPS_AUTH_EXTRA_HEADERS[@]` instead of hardcoding `x-api-key`.
- Neither path exports the credential to child processes — reduces blast
  radius and avoids poisoning Claude Code sessions launched from the same
  shell.
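
The selection step of `resolve_auth` might be sketched as below. This is a hedged illustration: `select_auth` is a hypothetical helper, the token and expiry are passed in as arguments rather than read from the Keychain, and the expiry is assumed to be already converted to epoch seconds. The three `OPS_AUTH_*` globals and the header values are the ones named in the commit message.

```shell
#!/usr/bin/env bash
# Sketch: select_auth is a hypothetical name. Arguments:
#   $1 OAuth token (may be empty)   $2 token expiry, epoch seconds (assumed unit)
#   $3 current time, epoch seconds  $4 API key (may be empty)
select_auth() {
  local token=$1 expires_at=$2 now=$3 api_key=$4
  if [[ -n "$token" ]] && (( expires_at - now >= 60 )); then
    # OAuth needs at least 60 s of remaining life to avoid mid-call expiry.
    OPS_AUTH_MODE="oauth"
    OPS_AUTH_HEADER="Authorization: Bearer $token"
    OPS_AUTH_EXTRA_HEADERS=(-H "anthropic-beta: oauth-2025-04-20")
  elif [[ -n "$api_key" ]]; then
    OPS_AUTH_MODE="apikey"
    OPS_AUTH_HEADER="x-api-key: $api_key"
    OPS_AUTH_EXTRA_HEADERS=()
  else
    return 1
  fi
}

# Demo with placeholder credentials: 100 s of token life left, so OAuth wins.
select_auth "placeholder-token" 1000 900 "placeholder-key"
echo "mode=$OPS_AUTH_MODE"
```

The globals are plain shell variables, not exported, which matches the stated goal of not leaking the credential to child processes.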

Drive-by fix: two `<<PYEOF` heredocs were unquoted, so bash expanded
backticks inside the embedded Python regex
    re.sub(r'^```(?:json)?\s*', '', raw.strip())
as command substitution, which blew up with
    "syntax error near unexpected token `?\s*'".
This made the JSON-parse step silently fail and produced "Total memory
size: 0 bytes" on every successful API call. Quoted both heredoc
delimiters to `<<'PYEOF'`. Args after the delimiter on the same line are
unaffected by quoting.
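
The heredoc behaviour behind this drive-by fix can be reproduced in isolation. In this small demo `cat` stands in for the embedded Python interpreter and `EOF` for `PYEOF`; the body text is made up for illustration.

```shell
#!/usr/bin/env bash
# Unquoted delimiter: the shell performs command substitution on backticks
# before the body reaches the reader (cat stands in for python3 here).
unquoted=$(cat <<EOF
pattern: `echo SUBSTITUTED`
EOF
)

# Quoted delimiter: the body is passed through literally.
quoted=$(cat <<'EOF'
pattern: `echo SUBSTITUTED`
EOF
)

echo "$unquoted"   # pattern: SUBSTITUTED
echo "$quoted"     # pattern: `echo SUBSTITUTED`
```

Quoting the delimiter also disables `$variable` expansion in the body, so any values the embedded script needs have to arrive as arguments or environment variables.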

Validated end-to-end with ANTHROPIC_API_KEY unset and the keychain API-key
entry removed: OAuth path activated, call_claude logged
`Auth: Claude Code OAuth (subscription — no per-token billing)`, API
returned 200, and 12 memory files were written (9 contacts, preferences,
topics_active, donts) totalling 8.8 KB.

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
claude-ops/scripts/wacli-keepalive.sh (1)

440-457: ⚠️ Potential issue | 🟠 Major

Keep the restart signal separate from the active batch lock.

The background subshell now holds BATCH_MARKER, but the supervisor still removes that same marker at Line 564 before restarting. That can restart sync --follow while the subshell is still backfilling; if the subshell releases first, Line 560 can instead treat the self-killed sync as an unexpected exit.

Use separate “batch active” and “restart requested” markers, or wait for the active marker to clear before restarting without deleting it from the supervisor path.

Suggested direction
+# Separate "active batch" from "sync was intentionally killed for a batch".
+BATCH_RESTART_MARKER="$STORE/.batch_restart_wacli"
+
 acquire_wacli_batch() {
   # Reentrant: if an outer caller already holds the batch, skip the real work.
   # _WACLI_BATCH_HELD is set by periodic_backfill's subshell to prevent inner
   # functions (detect_missed_messages, write_backfill_memory) from releasing the
   # marker while the subshell is still running.
   if [[ "${_WACLI_BATCH_HELD:-0}" == "1" ]]; then return 0; fi
-  touch "$BATCH_MARKER"
+  touch "$BATCH_MARKER" "$BATCH_RESTART_MARKER"
   ...
 }

 ...
   (
     # Hold the batch lock for the entire subshell so the supervisor loop
     # always sees BATCH_MARKER while we are working.  _WACLI_BATCH_HELD
     # makes the inner acquire/release calls in detect_missed_messages and
     # write_backfill_memory no-ops, preventing premature marker removal.
     acquire_wacli_batch
     _WACLI_BATCH_HELD=1
+    trap '_WACLI_BATCH_HELD=0; release_wacli_batch' EXIT

     auto_detect_empty_chats
     detect_missed_messages
     if [[ -f "$STORE/.backfill_jids" ]] && [[ -s "$STORE/.backfill_jids" ]]; then
       run_backfill
       write_backfill_memory
     fi

-    _WACLI_BATCH_HELD=0
-    release_wacli_batch
   ) &

 ...
-  if ! check_pause_signal && [[ ! -f "$BATCH_MARKER" ]]; then
+  if ! check_pause_signal && [[ ! -f "$BATCH_RESTART_MARKER" ]]; then
     break
   fi
-  # Clear any stale batch marker before restarting
-  rm -f "$BATCH_MARKER"
+  while [[ -f "$BATCH_MARKER" ]]; do
+    sleep 2
+  done
+  rm -f "$BATCH_RESTART_MARKER"
 done

Also applies to: 560-564

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@claude-ops/scripts/wacli-keepalive.sh` around lines 440 - 457, The supervisor
is removing the same BATCH_MARKER used by the background subshell, causing races
between backfill and restart; change the protocol so batch locking and restart
requests use separate markers and checks: keep
acquire_wacli_batch/release_wacli_batch and _WACLI_BATCH_HELD to manage the
batch marker only, add a new request_wacli_restart (or a
$STORE/.restart_requested marker) that the subshell sets when it wants a restart
instead of deleting the batch marker, and modify the supervisor restart logic to
check for the batch-active marker (via is_wacli_batch_active or checking
BATCH_MARKER) and wait for it to clear before proceeding while only removing the
restart marker (not the batch marker); ensure any code path that currently
deletes the batch marker is updated to delete the restart marker instead.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0fc8a440-fdab-45b7-a73f-e94112f8252b

📥 Commits

Reviewing files that changed from the base of the PR and between 7eec07f and 77c01e4.

📒 Files selected for processing (1)
  • claude-ops/scripts/wacli-keepalive.sh

…ions

The find-generic-password call omitted -a ops-daemon, while the seed
command in the header comments used it. Without the account filter,
security(1) returns the first matching service entry across all
accounts — wrong key if multiple ANTHROPIC_API_KEY entries exist.

Add -a ops-daemon to the lookup and update the precedence comment
from '(any account)' to 'account "ops-daemon"'.
blocksorg[bot]

This comment was marked as resolved.

chatgpt-codex-connector[bot]

This comment was marked as resolved.

…in account filter

- Make periodic_backfill synchronous (remove background subshell) to fix a
  race where a fast-finishing backfill removes BATCH_MARKER before the
  supervisor loop checks it, causing the script to exit and launchd to
  restart in a boot loop.
- Leave BATCH_MARKER in place after backfill so the supervisor restarts
  sync instead of exiting; add early break in the monitor loop to skip
  refresh_wacli_cache when sync is already dead.
- Add -a ops-daemon to the keychain lookup in ops-memory-extractor.sh so
  it matches the seed command and reliably retrieves the daemon-specific
  API key entry.
@auroracapital auroracapital merged commit 4a4181a into main Apr 18, 2026
7 of 8 checks passed
@auroracapital auroracapital deleted the fix/wacli-persistent-connection branch April 18, 2026 14:07
auroracapital added a commit that referenced this pull request Apr 18, 2026
CodeRabbit (ops-memory-extractor.sh:117): add `-a ops-daemon` to the
`security find-generic-password` lookup so it matches the seed command
and reliably returns the daemon-specific keychain entry.

CodeRabbit/blocksorg (wacli-keepalive.sh): close the BATCH_MARKER race
where the supervisor could `rm -f BATCH_MARKER` and restart --follow
while the periodic_backfill subshell was still executing run_backfill.
Fix: capture `_BACKFILL_PID` after forking the subshell and wait for it
before clearing the marker in the outer restart block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
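
The wait-before-clear fix described in this commit message can be sketched as follows. This is an illustration only: the marker path, the temp-dir `STORE`, the result file and the `sleep` stand-in for `run_backfill` are hypothetical; `_BACKFILL_PID` and `BATCH_MARKER` are the names used in the message.

```shell
#!/usr/bin/env bash
# Sketch only: paths and the sleep-based worker are illustrative.
STORE=$(mktemp -d)
BATCH_MARKER="$STORE/.batch_marker"

touch "$BATCH_MARKER"
(
  sleep 1                                  # stand-in for run_backfill
  echo "done" > "$STORE/backfill_result"
) &
_BACKFILL_PID=$!                           # capture the subshell's PID right after forking

# Supervisor restart path: block until the worker has finished, and only then
# clear the marker, so --follow is never restarted mid-backfill.
wait "$_BACKFILL_PID"
rm -f "$BATCH_MARKER"
echo "backfill finished, marker cleared"
```

`wait` only works on children of the current shell, so the PID has to be captured in the same process that later clears the marker.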

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 7 additional findings in Devin Review.

-  -H "x-api-key: ${ANTHROPIC_API_KEY}" \
+  -H "${OPS_AUTH_HEADER}" \
   -H "anthropic-version: 2023-06-01" \
+  "${OPS_AUTH_EXTRA_HEADERS[@]}" \

🔴 Empty array expansion of OPS_AUTH_EXTRA_HEADERS crashes script on macOS bash 3.2 under set -u

When using API key auth (not OAuth), OPS_AUTH_EXTRA_HEADERS is set to an empty array () at ops-memory-extractor.sh:82. The curl command at line 306 expands "${OPS_AUTH_EXTRA_HEADERS[@]}", which under set -u (line 5) causes a fatal "unbound variable" error on bash < 4.4. macOS ships with bash 3.2 (due to GPLv3 licensing), and this script explicitly targets macOS (security keychain, BSD date -v, launchd references). This means the entire API key fallback path — the primary auth method for users without Claude Code OAuth — is broken on macOS with the default system bash.

Suggested change
-  "${OPS_AUTH_EXTRA_HEADERS[@]}" \
+  ${OPS_AUTH_EXTRA_HEADERS[@]+"${OPS_AUTH_EXTRA_HEADERS[@]}"} \

-  -H "x-api-key: ${ANTHROPIC_API_KEY}" \
+  -H "${OPS_AUTH_HEADER}" \
   -H "anthropic-version: 2023-06-01" \
+  "${OPS_AUTH_EXTRA_HEADERS[@]}" \

Empty-array expansion with set -u will crash on bash < 4.4 (macOS system bash 3.2)

Both scripts declare set -euo pipefail. In the API-key auth path resolve_auth sets OPS_AUTH_EXTRA_HEADERS=() (empty array). On bash < 4.4 expanding an empty array subscript under set -u throws:

-bash: OPS_AUTH_EXTRA_HEADERS[@]: unbound variable

This kills the curl call and causes die to fire — the memory extractor exits without writing any files. It's a silent regression for users whose launchd PATH resolves to /bin/bash (macOS system bash 3.2) rather than a Homebrew bash 5.x, which is the common launchd case when the plist doesn't explicitly extend PATH.

The OAuth path is unaffected (array is non-empty: (-H "anthropic-beta: oauth-2025-04-20")), but any user falling through to the API-key path will hit this.

Fix — use the conditional-expansion form, which is safe on bash 3.2 with set -u:

# Before
    "${OPS_AUTH_EXTRA_HEADERS[@]}" \

# After
    ${OPS_AUTH_EXTRA_HEADERS[@]+"${OPS_AUTH_EXTRA_HEADERS[@]}"} \

${var[@]+word} does not trigger nounset — it explicitly tests whether the parameter is set before expanding — and with an empty array the condition is false so nothing is emitted, avoiding the unbound-variable error.
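
The conditional-expansion form can be checked with a small stand-alone script. `count_args` is a hypothetical helper standing in for `curl`; it only reports how many arguments the expansion produced, and the header values are placeholders.

```shell
#!/usr/bin/env bash
set -u

# Stand-in for curl: prints how many arguments it received.
count_args() { echo "$#"; }

# API-key mode: no extra headers. The guarded form expands to nothing and is
# safe under set -u even on bash versions older than 4.4.
OPS_AUTH_EXTRA_HEADERS=()
count_args -H "x-api-key: placeholder" \
  ${OPS_AUTH_EXTRA_HEADERS[@]+"${OPS_AUTH_EXTRA_HEADERS[@]}"}          # prints 2

# OAuth mode: the array has two elements, each passed as one intact argument.
OPS_AUTH_EXTRA_HEADERS=(-H "anthropic-beta: oauth-2025-04-20")
count_args -H "Authorization: Bearer placeholder" \
  ${OPS_AUTH_EXTRA_HEADERS[@]+"${OPS_AUTH_EXTRA_HEADERS[@]}"}          # prints 4
```

The inner double quotes matter: without them the `anthropic-beta: oauth-2025-04-20` header would be split on its space into two separate arguments.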

auroracapital added a commit that referenced this pull request Apr 18, 2026
Minor release bundling three feature PRs and multiple security/stability fixes:

- PR #141: /gtm — cross-channel go-to-market planning skill
- PR #139: /ops:projects portfolio dashboard + GSD registry sync
- PR #140: ops-speedup v2 parity (GPU/ANE monitoring, power hogs, OS actions)
- PR #138: ops-memory-extractor Claude Code OAuth support + wacli persistent --follow fix

Bug fixes (see CHANGELOG.md [1.7.0] for full detail):
- SEV-9: ops-speedup eval shell-injection (Seer)
- SEV-9: ops-projects hardcoded /Users/ path breaking for all other users (Seer + blocksorg + cursor + devin + codex)
- SEV-8: ops-speedup RETURN trap race + systemd mask allowlist
- SEV-7: ops-speedup lsof +D probe wedge, daemon-services backing-script gap, ops-projects AskUserQuestion allowed-tools mismatch
- wacli --follow torn down by immediate backfill (now INITIAL_BACKFILL_DELAY=30 + reentrant guard)

Bumps claude-ops/package.json, claude-ops/.claude-plugin/plugin.json, and
.claude-plugin/marketplace.json plugins[0].version 1.6.2 → 1.7.0.