Skip to content

fix(continuous-learning-v2): observer background process crashes immediately#312

Merged
affaan-m merged 3 commits into
affaan-m:mainfrom
zzzhizhia:fix/observer-background-process
Mar 3, 2026
Merged

fix(continuous-learning-v2): observer background process crashes immediately#312
affaan-m merged 3 commits into
affaan-m:mainfrom
zzzhizhia:fix/observer-background-process

Conversation

@zzzhizhia

@zzzhizhia zzzhizhia commented Mar 1, 2026

Copy link
Copy Markdown
Contributor

Summary

The observer background process (start-observer.sh) crashes immediately after starting due to three independent bugs. This PR fixes all three and also fixes the Haiku prompt so that generated instinct files match the format expected by instinct-cli.py.

Bug 1: Nested session detection

When launched from a Claude Code session, the child process inherits the CLAUDECODE env var, causing claude CLI to refuse with "cannot be launched inside another Claude Code session".

Fix: unset CLAUDECODE in the background process.

Bug 2: set -e kills the loop

The parent script's set -e is inherited by the subshell. When claude exits non-zero (e.g. max turns reached), the entire observer loop dies silently.

Fix: set +e in the background process.

Bug 3: Subshell dies when parent exits

( ... ) & disown loses IO handles when the parent shell exits, killing the background process within seconds.

Fix: Use nohup /bin/bash -c '...' for full detachment, and sleep & wait pattern to allow SIGUSR1 to interrupt sleep without killing the process.

Bonus: Inline instinct format in prompt

The previous prompt told Haiku to follow "the format in the observer agent spec", but Haiku has no way to read observer.md. This resulted in free-form Markdown files that instinct-cli.py could not parse (missing YAML frontmatter / id field). The prompt now includes the exact format inline.

Test plan

  • start-observer.sh start → process stays alive after 10+ seconds
  • start-observer.sh status → correctly reports running PID
  • start-observer.sh stop → cleanly stops the process
  • Observer survives parent shell exit
  • Observer survives claude CLI failures (non-zero exit)
  • Generated instinct files have correct YAML frontmatter format
  • instinct-cli.py status correctly parses generated files

Related


Summary by cubic

Fixes observer background process crashes and makes the service robust and signal-safe. Instinct generation now uses a strict YAML format and archives processed observations to prevent duplicates and reprocessing.

  • Bug Fixes

    • Start via nohup with env-passed vars; interruptible sleep/wait; verify PID is alive on start.
    • Unset CLAUDECODE and disable set -e so claude failures don’t kill the loop.
    • Stronger process management: track/kill SLEEP_PID, USR1 handler clears pending sleep, USR1_FIRED prevents double runs, unified cleanup.
    • Guard PID file cleanup to avoid restart races; only remove if it belongs to the current process.
  • New Features

    • Prompt inlines the exact instinct spec: mandatory YAML frontmatter (id/trigger/confidence/domain/source), allowed domains, confidence guidance, project/global scope, no code snippets, and update existing instead of duplicating.
    • Archive processed observations to observations.archive/ after analysis.

Written for commit e5fd15f. Summary will update on new commits.

Summary by CodeRabbit

  • Improvements

    • Observer now runs as a fully detached background service with stronger startup validation, clearer success/failure messages, and refined timing/error handling.
    • More responsive, interruptible scheduling that avoids duplicate runs when manually triggered.
  • New Features

    • Signal-triggered on-demand analysis that runs immediately without disrupting schedule.
    • Automatic archival of processed observations with timestamped filenames.
    • Stricter, expanded structured output for analysis results including required fields and an evidence block.

Copilot AI review requested due to automatic review settings March 1, 2026 14:58
@coderabbitai

coderabbitai Bot commented Mar 1, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f80af3 and e5fd15f.

📒 Files selected for processing (1)
  • skills/continuous-learning-v2/agents/start-observer.sh

📝 Walkthrough

Walkthrough

Runs the observer as a fully detached nohup background process with PID-file lifecycle and verification, interruptible sleep loop, USR1-triggered on-demand analysis, timestamped archival of observations, and stricter Claude analysis output requiring YAML instincts frontmatter.

Changes

Cohort / File(s) Summary
Observer Startup Script Rework
skills/continuous-learning-v2/agents/start-observer.sh
Converted to detached nohup execution with exported env vars (CONFIG_DIR, PID_FILE, LOG_FILE, OBSERVATIONS_FILE); creates/cleans PID file with traps; uses interruptible background sleep + wait loop; adds USR1 handler to cancel sleep and trigger immediate analysis; archives processed observations with timestamped filenames; tightens Claude prompt and enforces YAML-frontmatter instincts output; improves startup liveness check and logging.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Starter as "Starter (invoker)"
    participant Observer as "start-observer.sh (detached)"
    participant FS as "Filesystem (obs file / archive)"
    participant Claude as "Claude (analysis)"
    Starter->>Observer: nohup bash -c (export envs + start detached block)
    Observer->>Observer: write PID_FILE, setup traps & cleanup
    Observer->>Observer: spawn interruptible background sleep (sleep + wait)
    alt USR1 received
        Observer->>Observer: cancel sleep, set USR1 flag, run on_usr1
        Observer->>FS: read `OBSERVATIONS_FILE`
        Observer->>Claude: send analysis request (strict YAML frontmatter)
        Claude-->>Observer: return instincts YAML
        Observer->>FS: move processed file to archive (timestamp) & recreate observations file
    else periodic timer triggers
        Observer->>FS: read `OBSERVATIONS_FILE`
        Observer->>Claude: send scheduled analysis request
        Claude-->>Observer: return instincts YAML
        Observer->>FS: archive processed observations
    end
    Observer->>Starter: print startup success (PID & log) or failure if process died
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped out, nohup cloak around my back,
PID tucked, I pat the midnight track.
USR1 drums — I pause, I leap, I heed,
Claude hums YAML, I file the deed. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(continuous-learning-v2): observer background process crashes immediately' clearly and specifically describes the main change—fixing immediate crashes of the observer background process.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
skills/continuous-learning-v2/agents/start-observer.sh (1)

77-80: Variable injection can break on paths with single quotes.

The '"$VAR"' pattern will produce malformed bash if any path contains a single quote (e.g., /home/user's-dir/.claude/...). While uncommon, this can cause silent failures.

A safer alternative is to use printf '%q' to escape values or export the variables so the child inherits them.

Alternative: export variables before nohup
+    # Export for child process
+    export CONFIG_DIR PID_FILE LOG_FILE OBSERVATIONS_FILE
+
     # The observer loop — fully detached with nohup, IO redirected to log
     nohup /bin/bash -c '
       set +e
       unset CLAUDECODE
-
-      CONFIG_DIR="'"$CONFIG_DIR"'"
-      PID_FILE="'"$PID_FILE"'"
-      LOG_FILE="'"$LOG_FILE"'"
-      OBSERVATIONS_FILE="'"$OBSERVATIONS_FILE"'"
+      # Variables inherited from parent via export

       trap "rm -f \"$PID_FILE\"; exit 0" TERM INT
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/agents/start-observer.sh` around lines 77 - 80,
The current inline assignment using the pattern
'"$CONFIG_DIR"','"$PID_FILE"','"$LOG_FILE"','"$OBSERVATIONS_FILE"' can break
when values contain single quotes; update start-observer.sh to either (a) escape
each variable before embedding by using printf '%q' for CONFIG_DIR, PID_FILE,
LOG_FILE and OBSERVATIONS_FILE so the generated shell fragment is safe, or (b)
simpler: export those variables in the parent shell (export CONFIG_DIR PID_FILE
LOG_FILE OBSERVATIONS_FILE) and invoke nohup/child without embedding quoted
values so the child inherits them; modify the code that writes/echoes those
lines to use one of these approaches and ensure the unique symbols CONFIG_DIR,
PID_FILE, LOG_FILE and OBSERVATIONS_FILE are handled accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/continuous-learning-v2/agents/start-observer.sh`:
- Around line 147-150: The loop currently runs sleep 300 & wait $! then always
calls analyze_observations, which causes a duplicate run when the USR1 trap
already invoked analyze_observations; fix this by introducing a skip flag (e.g.,
SKIP_SCHEDULED_ANALYSIS) that the USR1 trap sets to true before calling
analyze_observations, and in the while loop (the sleep/wait block that calls
analyze_observations) check that flag: if set, clear it and skip the scheduled
analyze_observations invocation; reference the existing analyze_observations
function and the while true; do sleep 300 & wait $! loop to locate where to add
the flag guard and where the trap should set the flag.

---

Nitpick comments:
In `@skills/continuous-learning-v2/agents/start-observer.sh`:
- Around line 77-80: The current inline assignment using the pattern
'"$CONFIG_DIR"','"$PID_FILE"','"$LOG_FILE"','"$OBSERVATIONS_FILE"' can break
when values contain single quotes; update start-observer.sh to either (a) escape
each variable before embedding by using printf '%q' for CONFIG_DIR, PID_FILE,
LOG_FILE and OBSERVATIONS_FILE so the generated shell fragment is safe, or (b)
simpler: export those variables in the parent shell (export CONFIG_DIR PID_FILE
LOG_FILE OBSERVATIONS_FILE) and invoke nohup/child without embedding quoted
values so the child inherits them; modify the code that writes/echoes those
lines to use one of these approaches and ensure the unique symbols CONFIG_DIR,
PID_FILE, LOG_FILE and OBSERVATIONS_FILE are handled accordingly.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d3be88 and 561f9eb.

📒 Files selected for processing (1)
  • skills/continuous-learning-v2/agents/start-observer.sh

Comment thread skills/continuous-learning-v2/agents/start-observer.sh Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 561f9eb7bf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread skills/continuous-learning-v2/agents/start-observer.sh Outdated
Comment thread skills/continuous-learning-v2/agents/start-observer.sh Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes the continuous-learning-v2 observer background process so it stays running reliably when launched from interactive Claude Code sessions, and ensures generated instinct files conform to instinct-cli.py’s expected frontmatter format.

Changes:

  • Detach the observer loop using nohup /bin/bash -c ... and redirect output to the observer log.
  • Prevent immediate loop termination by disabling inherited set -e behavior and unsetting CLAUDECODE in the background process.
  • Update the Haiku prompt to include the full instinct file format inline (YAML frontmatter + required id).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread skills/continuous-learning-v2/agents/start-observer.sh Outdated
Comment thread skills/continuous-learning-v2/agents/start-observer.sh Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 1 file

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
skills/continuous-learning-v2/agents/start-observer.sh (2)

181-183: Replace fixed startup delay with bounded readiness polling.

Line 182’s fixed sleep 2 can still false-fail on slower machines/filesystems. A short polling loop with timeout is more reliable.

⏱️ Suggested refactor
-    # Wait for PID file
-    sleep 2
+    # Wait up to 10s for PID file
+    for _ in {1..40}; do
+      [ -f "$PID_FILE" ] && break
+      sleep 0.25
+    done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/agents/start-observer.sh` around lines 181 -
183, The fixed sleep (sleep 2) used to "Wait for PID file" can false-fail on
slow systems; replace it with a bounded polling loop that repeatedly checks for
the PID file's existence (e.g., the variable holding the PID path such as
OBSERVER_PID_FILE or PID_FILE used earlier in start-observer.sh) with a short
sleep (e.g., 0.2s) and an overall timeout (e.g., 10s); if the file appears break
and continue, otherwise after timeout emit an error and exit non-zero—implement
the loop where the current sleep 2 sits and preserve existing logging/exit
behavior.

26-31: Add explicit PID validation for defense-in-depth.

Lines 27, 45, 65, and 185 use PID values from the PID file with kill. While kill -0 provides a safety check, explicit validation before reading the PID improves code clarity and robustness. Consider adding a validation function to ensure the PID file contains only positive integers before use, rejecting malformed values (e.g., negative numbers, special values like -1).

Suggested validation function
+is_valid_pid() {
+  [[ "$1" =~ ^[1-9][0-9]*$ ]]
+}
+
 case "${1:-start}" in
   stop)
     if [ -f "$PID_FILE" ]; then
       pid=$(cat "$PID_FILE")
+      if ! is_valid_pid "$pid"; then
+        echo "Observer not running (invalid PID file)."
+        rm -f "$PID_FILE"
+        exit 0
+      fi
       if kill -0 "$pid" 2>/dev/null; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/agents/start-observer.sh` around lines 26 - 31,
Add a defensive PID validation routine (e.g., is_valid_pid or validate_pid) that
reads the PID string from PID_FILE and ensures it is a non-zero positive integer
(reject negative numbers, zero, empty strings, and values containing non-digits
or special prefixes like "-1"); call this validator wherever the script reads
pid from PID_FILE (references: PID_FILE variable and the blocks that call kill
and kill -0) and if validation fails log an error, remove or ignore the
malformed PID_FILE, and avoid calling kill with the bad value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@skills/continuous-learning-v2/agents/start-observer.sh`:
- Around line 181-183: The fixed sleep (sleep 2) used to "Wait for PID file" can
false-fail on slow systems; replace it with a bounded polling loop that
repeatedly checks for the PID file's existence (e.g., the variable holding the
PID path such as OBSERVER_PID_FILE or PID_FILE used earlier in
start-observer.sh) with a short sleep (e.g., 0.2s) and an overall timeout (e.g.,
10s); if the file appears break and continue, otherwise after timeout emit an
error and exit non-zero—implement the loop where the current sleep 2 sits and
preserve existing logging/exit behavior.
- Around line 26-31: Add a defensive PID validation routine (e.g., is_valid_pid
or validate_pid) that reads the PID string from PID_FILE and ensures it is a
non-zero positive integer (reject negative numbers, zero, empty strings, and
values containing non-digits or special prefixes like "-1"); call this validator
wherever the script reads pid from PID_FILE (references: PID_FILE variable and
the blocks that call kill and kill -0) and if validation fails log an error,
remove or ignore the malformed PID_FILE, and avoid calling kill with the bad
value.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 561f9eb and 29c413e.

📒 Files selected for processing (1)
  • skills/continuous-learning-v2/agents/start-observer.sh

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: this PR has merge conflicts. Please rebase or resolve.

…diately

Three bugs prevent the observer from running:

1. Nested session detection: When launched from a Claude Code session,
   the child process inherits CLAUDECODE env var, causing `claude` CLI
   to refuse with "cannot be launched inside another session". Fix: unset
   CLAUDECODE in the background process.

2. set -e kills the loop: The parent script's `set -e` is inherited by
   the subshell. When `claude` exits non-zero (e.g. max turns reached),
   the entire observer loop dies. Fix: `set +e` in the background process.

3. Subshell dies when parent exits: `( ... ) & disown` loses IO handles
   when the parent shell exits, killing the background process. Fix: use
   `nohup /bin/bash -c '...'` for full detachment, and `sleep & wait`
   to allow SIGUSR1 to interrupt sleep without killing the process.

Additionally, the prompt for Haiku now includes the exact instinct file
format inline (YAML frontmatter with id/trigger/confidence/domain/source
fields), since the previous prompt referenced "the observer agent spec"
which Haiku could not actually read, resulting in instinct files that
the CLI parser could not parse.
- Use `env` to pass variables to child process instead of quote-splicing,
  avoiding shell injection risk from special chars in paths
- Add USR1_FIRED flag to prevent double analysis when SIGUSR1 interrupts
  the sleep/wait cycle
- Track SLEEP_PID and kill it in both TERM trap and USR1 handler to
  prevent orphaned sleep processes from accumulating
- Consolidate cleanup logic into a dedicated cleanup() function
@zzzhizhia zzzhizhia force-pushed the fix/observer-background-process branch from 29c413e to 2f80af3 Compare March 2, 2026 07:06

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
skills/continuous-learning-v2/agents/start-observer.sh (1)

259-260: Prefer a short poll loop over a fixed 2s startup wait.

A fixed delay can cause flaky “failed to start” reports on slower machines.

💡 Proposed refactor
-    # Wait for PID file
-    sleep 2
+    # Wait up to 10s for PID file
+    for _ in {1..50}; do
+      [ -f "$PID_FILE" ] && break
+      sleep 0.2
+    done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/agents/start-observer.sh` around lines 259 -
260, Replace the fixed "sleep 2" startup wait with a short poll loop that checks
for the observer PID file existence (the PID file referenced where "sleep 2"
currently sits) at a small interval (e.g., 100–250ms), stop polling once the PID
file appears, and enforce a reasonable overall timeout after which the script
logs a clear failure and exits non‑zero; update the block around the existing
"sleep 2" call to perform this polling, use the same PID file variable name used
elsewhere in the script (e.g., PID_FILE or PIDFILE), and ensure to log
success/failure so callers don’t get flaky “failed to start” reports.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/continuous-learning-v2/agents/start-observer.sh`:
- Around line 158-161: The cleanup function currently unconditionally removes
PID_FILE which can delete a newer observer's PID during restart races; modify
cleanup (the cleanup function that uses SLEEP_PID and PID_FILE) to first read
the PID stored in PID_FILE and only rm -f "$PID_FILE" if that stored PID equals
the current process ID ($$), otherwise leave the file untouched; keep the
existing kill of SLEEP_PID and the exit behavior unchanged.

---

Nitpick comments:
In `@skills/continuous-learning-v2/agents/start-observer.sh`:
- Around line 259-260: Replace the fixed "sleep 2" startup wait with a short
poll loop that checks for the observer PID file existence (the PID file
referenced where "sleep 2" currently sits) at a small interval (e.g.,
100–250ms), stop polling once the PID file appears, and enforce a reasonable
overall timeout after which the script logs a clear failure and exits non‑zero;
update the block around the existing "sleep 2" call to perform this polling, use
the same PID file variable name used elsewhere in the script (e.g., PID_FILE or
PIDFILE), and ensure to log success/failure so callers don’t get flaky “failed
to start” reports.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29c413e and 2f80af3.

📒 Files selected for processing (1)
  • skills/continuous-learning-v2/agents/start-observer.sh

Comment thread skills/continuous-learning-v2/agents/start-observer.sh
Only remove PID file in cleanup trap if it still belongs to the
current process, preventing a restarted observer from losing its
PID file when the old process exits.
@affaan-m affaan-m merged commit adc0f67 into affaan-m:main Mar 3, 2026
3 checks passed
floatingman pushed a commit to floatingman/everything-claude-code that referenced this pull request Mar 22, 2026
…diately (affaan-m#312)

* fix(continuous-learning-v2): observer background process crashes immediately

Three bugs prevent the observer from running:

1. Nested session detection: When launched from a Claude Code session,
   the child process inherits CLAUDECODE env var, causing `claude` CLI
   to refuse with "cannot be launched inside another session". Fix: unset
   CLAUDECODE in the background process.

2. set -e kills the loop: The parent script's `set -e` is inherited by
   the subshell. When `claude` exits non-zero (e.g. max turns reached),
   the entire observer loop dies. Fix: `set +e` in the background process.

3. Subshell dies when parent exits: `( ... ) & disown` loses IO handles
   when the parent shell exits, killing the background process. Fix: use
   `nohup /bin/bash -c '...'` for full detachment, and `sleep & wait`
   to allow SIGUSR1 to interrupt sleep without killing the process.

Additionally, the prompt for Haiku now includes the exact instinct file
format inline (YAML frontmatter with id/trigger/confidence/domain/source
fields), since the previous prompt referenced "the observer agent spec"
which Haiku could not actually read, resulting in instinct files that
the CLI parser could not parse.

* fix: address review feedback on observer process management

- Use `env` to pass variables to child process instead of quote-splicing,
  avoiding shell injection risk from special chars in paths
- Add USR1_FIRED flag to prevent double analysis when SIGUSR1 interrupts
  the sleep/wait cycle
- Track SLEEP_PID and kill it in both TERM trap and USR1 handler to
  prevent orphaned sleep processes from accumulating
- Consolidate cleanup logic into a dedicated cleanup() function

* fix: guard PID file cleanup against race condition on restart

Only remove PID file in cleanup trap if it still belongs to the
current process, preventing a restarted observer from losing its
PID file when the old process exits.
FrancescoRosciano pushed a commit to FRosciano-Mambo/everything-claude-code that referenced this pull request Jun 1, 2026
…diately (affaan-m#312)

* fix(continuous-learning-v2): observer background process crashes immediately

Three bugs prevent the observer from running:

1. Nested session detection: When launched from a Claude Code session,
   the child process inherits CLAUDECODE env var, causing `claude` CLI
   to refuse with "cannot be launched inside another session". Fix: unset
   CLAUDECODE in the background process.

2. set -e kills the loop: The parent script's `set -e` is inherited by
   the subshell. When `claude` exits non-zero (e.g. max turns reached),
   the entire observer loop dies. Fix: `set +e` in the background process.

3. Subshell dies when parent exits: `( ... ) & disown` loses IO handles
   when the parent shell exits, killing the background process. Fix: use
   `nohup /bin/bash -c '...'` for full detachment, and `sleep & wait`
   to allow SIGUSR1 to interrupt sleep without killing the process.

Additionally, the prompt for Haiku now includes the exact instinct file
format inline (YAML frontmatter with id/trigger/confidence/domain/source
fields), since the previous prompt referenced "the observer agent spec"
which Haiku could not actually read, resulting in instinct files that
the CLI parser could not parse.

* fix: address review feedback on observer process management

- Use `env` to pass variables to child process instead of quote-splicing,
  avoiding shell injection risk from special chars in paths
- Add USR1_FIRED flag to prevent double analysis when SIGUSR1 interrupts
  the sleep/wait cycle
- Track SLEEP_PID and kill it in both TERM trap and USR1 handler to
  prevent orphaned sleep processes from accumulating
- Consolidate cleanup logic into a dedicated cleanup() function

* fix: guard PID file cleanup against race condition on restart

Only remove PID file in cleanup trap if it still belongs to the
current process, preventing a restarted observer from losing its
PID file when the old process exits.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants