fix(continuous-learning-v2): add lazy-start observer logic#508
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds config-driven lazy-start and race-free startup to the observer: when enabled but not running, acquire an OS lock (flock or macOS lockfile), revalidate PID files, start the observer if still absent, and deduplicate USR1 signaling to avoid signaling the same PID multiple times. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Cron as Cron/Invoker
participant Observe as observe.sh
participant Config as Config (file / CLV2_CONFIG)
participant Lock as OS Lock (flock / lockfile)
participant Starter as start-observer.sh
participant PID as PID files (project/global)
participant Observer as Observer process
Cron->>Observe: run
Observe->>Config: read effective config (is observer.enabled?)
Observe->>PID: check project/global PID files (is-running / stale cleanup)
alt PID present
Observe->>Observer: send USR1 (dedupe per PID)
else enabled && no PID
Observe->>Lock: acquire (flock or lockfile)
Lock->>Observe: granted
Observe->>PID: re-check PID files (double-check / cleanup)
Observe->>Config: re-validate enabled
alt still missing & enabled
Observe->>Starter: invoke start-observer.sh
Starter->>Observer: start, write PID files
end
Observe->>Lock: release
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds a lazy-start mechanism to Key changes and concerns:
Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant C as Claude Code Hook
participant O as observe.sh
participant L as Lock (flock/lockfile)
participant S as start-observer.sh
participant P as observer-loop.sh
C->>O: invoke (PreToolUse / PostToolUse)
O->>O: read stdin, check disabled, check guards
O->>O: parse input, write to observations.jsonl
O->>O: read EFFECTIVE_CONFIG → OBSERVER_ENABLED?
alt OBSERVER_ENABLED=true and no PID file
O->>L: acquire lock (flock or lockfile)
L-->>O: lock acquired
O->>O: double-check PID files (_CHECK_OBSERVER_RUNNING)
O->>S: nohup start-observer.sh start (backgrounded)
note over S: reads ${SKILL_ROOT}/config.json<br/>(ignores CLV2_CONFIG!)
S->>P: nohup observer-loop.sh (backgrounded)
P-->>S: writes .observer.pid (async, ~2s)
O->>L: release lock
end
alt Observer PID file exists
O->>O: validate PID, deduplicate
O->>P: kill -USR1 $observer_pid
end
Last reviewed commit: 0f3927c |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
skills/continuous-learning-v2/hooks/observe.sh (1)
266-275:⚠️ Potential issue | 🟡 MinorSame observer can be signaled twice when both PID files point to the same process.
If project and global pid files contain the same PID, Line 272 runs twice, causing duplicate analysis triggers.
Proposed fix (dedupe signaled PIDs)
+signaled_pids=" " for pid_file in "${PROJECT_DIR}/.observer.pid" "${CONFIG_DIR}/.observer.pid"; do if [ -f "$pid_file" ]; then observer_pid=$(cat "$pid_file") + case "$signaled_pids" in + *" $observer_pid "*) continue ;; + esac echo "From observe.sh: Observer PID found: $observer_pid" >> ~/claude-hooks.log if kill -0 "$observer_pid" 2>/dev/null; then echo "date: $(date) From observe.sh: Sending USR1 signal to observer: $observer_pid" >> ~/claude-hooks.log kill -USR1 "$observer_pid" 2>/dev/null || true + signaled_pids="${signaled_pids}${observer_pid} " fi fi done🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 266 - 275, The loop in observe.sh reads .observer.pid from both "${PROJECT_DIR}" and "${CONFIG_DIR}" and may call kill -USR1 twice when both files contain the same PID (observer_pid), causing duplicate triggers; fix by deduplicating PIDs before signaling: maintain a small in-memory set/list of already-signaled observer_pid values (e.g., signaled_pids) and check membership before executing the kill -USR1 on observer_pid, only sending the signal once per unique PID and still logging as currently done.
🧹 Nitpick comments (1)
skills/continuous-learning-v2/hooks/observe.sh (1)
17-17: Unconditional debug logging on every hook call will create noisy, unbounded log growth.These lines run on the hot path and append every invocation. Gate this behind a debug flag (and ideally rotate/size-cap logs).
Proposed fix
+DEBUG_HOOK_LOG="${CLV2_DEBUG_HOOK_LOG:-0}" +HOOK_LOG_FILE="${HOME}/claude-hooks.log" +log_debug() { + [ "$DEBUG_HOOK_LOG" = "1" ] || return 0 + printf '%s\n' "$1" >> "$HOOK_LOG_FILE" +} -echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log +log_debug "From observe.sh: Hook fired $(date)" @@ -echo "From observe.sh: CONFIG_DIR: $CONFIG_DIR" >> ~/claude-hooks.log -echo "From observe.sh: OBSERVATIONS_FILE: $OBSERVATIONS_FILE" >> ~/claude-hooks.log -echo "From observe.sh: CLAUDE_PROJECT_DIR: $CLAUDE_PROJECT_DIR" >> ~/claude-hooks.log -echo "From observe.sh: PYTHON_CMD: $PYTHON_CMD" >> ~/claude-hooks.log +log_debug "From observe.sh: CONFIG_DIR: $CONFIG_DIR" +log_debug "From observe.sh: OBSERVATIONS_FILE: $OBSERVATIONS_FILE" +log_debug "From observe.sh: CLAUDE_PROJECT_DIR: $CLAUDE_PROJECT_DIR" +log_debug "From observe.sh: PYTHON_CMD: $PYTHON_CMD" @@ -echo "date: $(date) From observe.sh: PROJECT_DIR: $PROJECT_DIR" >> ~/claude-hooks.log -echo "date: $(date) From observe.sh: CONFIG_DIR: $CONFIG_DIR" >> ~/claude-hooks.log +log_debug "date: $(date) From observe.sh: PROJECT_DIR: $PROJECT_DIR" +log_debug "date: $(date) From observe.sh: CONFIG_DIR: $CONFIG_DIR"Also applies to: 95-99, 240-242
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` at line 17, The script currently appends an unconditional debug line (echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log) on every invocation, causing unbounded log growth; wrap this and the other similar occurrences (around lines 95-99 and 240-242) in a conditional that checks a debug flag (e.g. CLAUDE_HOOKS_DEBUG or DEBUG_HOOKS) and only writes the log when that env var is set/truthy, and consider switching to a rotating/size-limited logging mechanism (or use system logger) instead of blind appends; update the echo statements in observe.sh so they are executed only when the debug flag is enabled.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 92-103: The disable-check currently only tests
"${CONFIG_DIR}/disabled" where CONFIG_DIR is hardcoded to
"${HOME}/.claude/homunculus", but the script later loads configuration from
CLV2_CONFIG, so the disable guard can be bypassed; update the logic so
CONFIG_DIR is resolved from CLV2_CONFIG when present (or check both locations)
before testing for a "disabled" file—i.e., use the CLV2_CONFIG value to compute
the effective config directory (or explicitly test "${CLV2_CONFIG}/disabled" in
addition to "${CONFIG_DIR}/disabled") so that the early guard in the observe.sh
startup path respects the same config source used by the lazy-start code.
- Around line 243-263: The lazy-start race is caused by checking only for
${PROJECT_DIR}/.observer.pid before launching start-observer.sh; wrap the
check+start in an atomic lock (e.g., use mkdir-based lock or flock on a
lockfile) inside observe.sh so only one hook instance performs the startup
check, and within that critical section validate any existing .observer.pid by
reading the PID and confirming the process is running (if not running, remove
the stale .observer.pid) before calling "${SKILL_ROOT}/agents/start-observer.sh"
start; ensure the lock is reliably released (trap cleanup) so concurrent hooks
don't spawn duplicate observers and stale PID files are cleaned up.
---
Outside diff comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 266-275: The loop in observe.sh reads .observer.pid from both
"${PROJECT_DIR}" and "${CONFIG_DIR}" and may call kill -USR1 twice when both
files contain the same PID (observer_pid), causing duplicate triggers; fix by
deduplicating PIDs before signaling: maintain a small in-memory set/list of
already-signaled observer_pid values (e.g., signaled_pids) and check membership
before executing the kill -USR1 on observer_pid, only sending the signal once
per unique PID and still logging as currently done.
---
Nitpick comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Line 17: The script currently appends an unconditional debug line (echo "From
observe.sh: Hook fired $(date)" >> ~/claude-hooks.log) on every invocation,
causing unbounded log growth; wrap this and the other similar occurrences
(around lines 95-99 and 240-242) in a conditional that checks a debug flag (e.g.
CLAUDE_HOOKS_DEBUG or DEBUG_HOOKS) and only writes the log when that env var is
set/truthy, and consider switching to a rotating/size-limited logging mechanism
(or use system logger) instead of blind appends; update the echo statements in
observe.sh so they are executed only when the debug flag is enabled.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0c319b52-7c7c-44e1-93fe-3ff635ef5d98
📒 Files selected for processing (2)
skills/continuous-learning-v2/hooks/observe.shskills/continuous-learning-v2/scripts/detect-project.sh
💤 Files with no reviewable changes (1)
- skills/continuous-learning-v2/scripts/detect-project.sh
There was a problem hiding this comment.
6 issues found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/continuous-learning-v2/hooks/observe.sh">
<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:17">
P2: Add error handling (`2>/dev/null || true`) to debug logging statements to prevent hook termination when `~/claude-hooks.log` cannot be written to. Currently unguarded redirects with `set -e` may cause immediate exit if HOME is unavailable, read-only, or disk is full, breaking the observation hook's primary functionality.</violation>
<violation number="2" location="skills/continuous-learning-v2/hooks/observe.sh:17">
P1: Removal of non-human session guards can cause observer self-looping and metadata pollution. The 5-layer guard system (entrypoint check, hook profile check, skip env var, subagent detection, and path exclusions) that prevented the observer from observing its own sessions has been completely removed without being relocated. This allows automated/subagent/observer sessions to write observations and trigger lazy-start, potentially causing the observer to ingest its own activity and pollute project-scoped metadata.</violation>
<violation number="3" location="skills/continuous-learning-v2/hooks/observe.sh:17">
P2: This unconditional hook logging appends to `~/claude-hooks.log` on every invocation and can lead to unbounded log growth; guard these writes behind a debug flag or remove them for normal runs.</violation>
<violation number="4" location="skills/continuous-learning-v2/hooks/observe.sh:244">
P2: Lazy-start observer has race condition: concurrent hooks can start multiple observers</violation>
<violation number="5" location="skills/continuous-learning-v2/hooks/observe.sh:244">
P1: The lazy-start guard should also consider the global `${CONFIG_DIR}/.observer.pid` (and verify PID liveness), otherwise a second observer can be started while a global observer is already running.</violation>
<violation number="6" location="skills/continuous-learning-v2/hooks/observe.sh:251">
P2: Avoid interpolating `$CONFIG_FILE` directly into the Python `-c` source; a path containing quotes can break parsing and make this config read brittle. Pass the fallback path via an environment variable and read it from `os.environ` inside Python.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Add one-off context when rerunning by tagging
@cubic-dev-aiwith guidance or docs links (includingllms.txt) - Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| # Hook phase from CLI argument: "pre" (PreToolUse) or "post" (PostToolUse) | ||
| HOOK_PHASE="${1:-post}" | ||
|
|
||
| echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log |
There was a problem hiding this comment.
P1: Removal of non-human session guards can cause observer self-looping and metadata pollution. The 5-layer guard system (entrypoint check, hook profile check, skip env var, subagent detection, and path exclusions) that prevented the observer from observing its own sessions has been completely removed without being relocated. This allows automated/subagent/observer sessions to write observations and trigger lazy-start, potentially causing the observer to ingest its own activity and pollute project-scoped metadata.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 17:
<comment>Removal of non-human session guards can cause observer self-looping and metadata pollution. The 5-layer guard system (entrypoint check, hook profile check, skip env var, subagent detection, and path exclusions) that prevented the observer from observing its own sessions has been completely removed without being relocated. This allows automated/subagent/observer sessions to write observations and trigger lazy-start, potentially causing the observer to ingest its own activity and pollute project-scoped metadata.</comment>
<file context>
@@ -14,7 +14,7 @@ set -e
# Hook phase from CLI argument: "pre" (PreToolUse) or "post" (PostToolUse)
HOOK_PHASE="${1:-post}"
-
+echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log
# ─────────────────────────────────────────────
# Read stdin first (before project detection)
</file context>
| # Hook phase from CLI argument: "pre" (PreToolUse) or "post" (PostToolUse) | ||
| HOOK_PHASE="${1:-post}" | ||
|
|
||
| echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log |
There was a problem hiding this comment.
P2: Add error handling (2>/dev/null || true) to debug logging statements to prevent hook termination when ~/claude-hooks.log cannot be written to. Currently unguarded redirects with set -e may cause immediate exit if HOME is unavailable, read-only, or disk is full, breaking the observation hook's primary functionality.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 17:
<comment>Add error handling (`2>/dev/null || true`) to debug logging statements to prevent hook termination when `~/claude-hooks.log` cannot be written to. Currently unguarded redirects with `set -e` may cause immediate exit if HOME is unavailable, read-only, or disk is full, breaking the observation hook's primary functionality.</comment>
<file context>
@@ -14,7 +14,7 @@ set -e
# Hook phase from CLI argument: "pre" (PreToolUse) or "post" (PostToolUse)
HOOK_PHASE="${1:-post}"
-
+echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log
# ─────────────────────────────────────────────
# Read stdin first (before project detection)
</file context>
| echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log | |
| echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log 2>/dev/null || true |
| # Hook phase from CLI argument: "pre" (PreToolUse) or "post" (PostToolUse) | ||
| HOOK_PHASE="${1:-post}" | ||
|
|
||
| echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log |
There was a problem hiding this comment.
P2: This unconditional hook logging appends to ~/claude-hooks.log on every invocation and can lead to unbounded log growth; guard these writes behind a debug flag or remove them for normal runs.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 17:
<comment>This unconditional hook logging appends to `~/claude-hooks.log` on every invocation and can lead to unbounded log growth; guard these writes behind a debug flag or remove them for normal runs.</comment>
<file context>
@@ -14,7 +14,7 @@ set -e
# Hook phase from CLI argument: "pre" (PreToolUse) or "post" (PostToolUse)
HOOK_PHASE="${1:-post}"
-
+echo "From observe.sh: Hook fired $(date)" >> ~/claude-hooks.log
# ─────────────────────────────────────────────
# Read stdin first (before project detection)
</file context>
d9c3391 to
c53bba9
Compare
Auto-starts observer when observer.enabled: true in config and no .observer.pid exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
skills/continuous-learning-v2/hooks/observe.sh (1)
279-284: Config file existence check doesn't coverCLV2_CONFIGoverride.Line 279 verifies
$CONFIG_FILE(${SKILL_ROOT}/config.json) exists, but line 282 reads fromCLV2_CONFIGif set—which might be a different file that was never validated. Also,open()without a context manager is inconsistent with the pattern used instart-observer.sh:79-92.Suggested fix
CONFIG_FILE="${SKILL_ROOT}/config.json" - if [ -f "$CONFIG_FILE" ] && [ -n "$PYTHON_CMD" ]; then + EFFECTIVE_CONFIG="${CLV2_CONFIG:-$CONFIG_FILE}" + if [ -f "$EFFECTIVE_CONFIG" ] && [ -n "$PYTHON_CMD" ]; then _enabled=$("$PYTHON_CMD" -c " import json, os -cfg = json.load(open(os.environ.get('CLV2_CONFIG', '$CONFIG_FILE'))) +with open('$EFFECTIVE_CONFIG') as f: + cfg = json.load(f) print(str(cfg.get('observer', {}).get('enabled', False)).lower()) " 2>/dev/null || echo "false")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 279 - 284, The existence check currently only verifies CONFIG_FILE but the Python snippet reads from CLV2_CONFIG when set, risking reading an unvalidated file; update the logic that sets _enabled to first determine the actual config path (use ${CLV2_CONFIG:-$CONFIG_FILE}) and verify that file exists before invoking PYTHON_CMD, and modify the Python one-liner to open the file using a context manager (with open(... ) as f) and load JSON from that file; keep the existing fallback to "false" and references to PYTHON_CMD and _enabled unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 279-284: The existence check currently only verifies CONFIG_FILE
but the Python snippet reads from CLV2_CONFIG when set, risking reading an
unvalidated file; update the logic that sets _enabled to first determine the
actual config path (use ${CLV2_CONFIG:-$CONFIG_FILE}) and verify that file
exists before invoking PYTHON_CMD, and modify the Python one-liner to open the
file using a context manager (with open(... ) as f) and load JSON from that
file; keep the existing fallback to "false" and references to PYTHON_CMD and
_enabled unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 66a48808-7b54-491b-bab2-1522251571ba
📒 Files selected for processing (1)
skills/continuous-learning-v2/hooks/observe.sh
There was a problem hiding this comment.
2 issues found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/continuous-learning-v2/hooks/observe.sh">
<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:275">
P2: Guard the lazy-start check/start with a lock (e.g., flock) so concurrent hook invocations can’t all observe a missing PID and launch duplicate observers.</violation>
<violation number="2" location="skills/continuous-learning-v2/hooks/observe.sh:279">
P2: CLV2_CONFIG environment variable is unable to override the config path when the default config.json doesn't exist. The file existence check uses CONFIG_FILE (always the default) instead of checking the actual config path that will be used by the Python code.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
- Use flock for atomic check-then-act to prevent race conditions - Check both project-scoped AND global PID files before starting - Support CLV2_CONFIG override for config file path - Check disabled file in lazy-start logic - Use double-check pattern after acquiring lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
skills/continuous-learning-v2/hooks/observe.sh (1)
298-307:⚠️ Potential issue | 🟡 MinorRace condition is fixed, but stale PID files still prevent observer restart.
The
flockimplementation properly addresses the concurrency race. However, lines 299 and 303 only check for PID file existence, not whether the process is actually running. If an observer crashes without cleaning up its PID file, lazy-start will be skipped forever.Consider adding stale PID cleanup inside the critical section:
🛡️ Proposed fix to clean stale PIDs before starting
if [ "$OBSERVER_ENABLED" = "true" ] && [ ! -f "${PROJECT_DIR}/.observer.pid" ] && [ ! -f "${CONFIG_DIR}/.observer.pid" ]; then ( flock -n 9 || exit 0 # Double-check PID files after acquiring lock - if [ ! -f "${PROJECT_DIR}/.observer.pid" ] && [ ! -f "${CONFIG_DIR}/.observer.pid" ]; then + # Clean up stale PID files first + for _pid_file in "${PROJECT_DIR}/.observer.pid" "${CONFIG_DIR}/.observer.pid"; do + if [ -f "$_pid_file" ]; then + _existing_pid="$(cat "$_pid_file" 2>/dev/null || true)" + if [ -n "$_existing_pid" ] && ! kill -0 "$_existing_pid" 2>/dev/null; then + rm -f "$_pid_file" + fi + fi + done + # Now check if we should start + if [ ! -f "${PROJECT_DIR}/.observer.pid" ] && [ ! -f "${CONFIG_DIR}/.observer.pid" ]; then "${SKILL_ROOT}/agents/start-observer.sh" start 2>/dev/null || true fi ) 9>"$LAZY_START_LOCK" fiNote: The initial check at line 299 must remain as-is (checking file existence) since it determines whether to attempt lock acquisition at all. The stale cleanup happens only after acquiring the lock.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 298 - 307, The current lazy-start uses flock but only checks for PID file existence; inside the critical section (the block using 9>"$LAZY_START_LOCK") add logic to detect and remove stale PID files before launching the observer: after acquiring the lock and before calling "${SKILL_ROOT}/agents/start-observer.sh", read PID(s) from "${PROJECT_DIR}/.observer.pid" and "${CONFIG_DIR}/.observer.pid" (if present) and use a safe liveness check (e.g., kill -0 $PID) to determine if the PID is dead; remove the stale PID file(s) if the process is not running, then re-check existence and only call start-observer.sh if no live PID remains—leave the outer pre-check that gates lock acquisition unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 298-307: The current lazy-start uses flock but only checks for PID
file existence; inside the critical section (the block using
9>"$LAZY_START_LOCK") add logic to detect and remove stale PID files before
launching the observer: after acquiring the lock and before calling
"${SKILL_ROOT}/agents/start-observer.sh", read PID(s) from
"${PROJECT_DIR}/.observer.pid" and "${CONFIG_DIR}/.observer.pid" (if present)
and use a safe liveness check (e.g., kill -0 $PID) to determine if the PID is
dead; remove the stale PID file(s) if the process is not running, then re-check
existence and only call start-observer.sh if no live PID remains—leave the outer
pre-check that gates lock acquisition unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8a6bea15-67c0-4054-8a56-e963c07e8619
📒 Files selected for processing (1)
skills/continuous-learning-v2/hooks/observe.sh
There was a problem hiding this comment.
2 issues found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/continuous-learning-v2/hooks/observe.sh">
<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:299">
P1: Checking only PID file existence can block recovery after crashes. Validate that the PID in each file is actually alive (for example with `kill -0`) before deciding to skip lazy-start.</violation>
<violation number="2" location="skills/continuous-learning-v2/hooks/observe.sh:301">
P1: Lazy-start observer logic depends on `flock` which is not available on macOS/BSD by default, causing the auto-start feature to silently fail on those platforms. Consider adding a `command -v flock` guard with a fallback to the previous non-atomic check behavior.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
- Add stale PID cleanup via _CHECK_OBSERVER_RUNNING function - Add macOS fallback using lockfile when flock unavailable - Fix CLV2_CONFIG override: use EFFECTIVE_CONFIG for both check and read - Use proper Python context manager (with open() as f) - Deduplicate signaled PIDs to avoid duplicate USR1 signals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
skills/continuous-learning-v2/hooks/observe.sh (2)
292-303:⚠️ Potential issue | 🟠 Major
disabledsentinel check still diverges fromCLV2_CONFIGpath resolution.In this block,
EFFECTIVE_CONFIGis used for enabled-state reading, but the disabled check is still tied to${CONFIG_DIR}/disabled(hardcoded base dir). IfCLV2_CONFIGpoints elsewhere, disable behavior can still diverge from config behavior.Proposed fix
-if [ -f "${CONFIG_DIR}/disabled" ]; then +EFFECTIVE_CONFIG="${CLV2_CONFIG:-${SKILL_ROOT}/config.json}" +EFFECTIVE_CONFIG_DIR="$(dirname "$EFFECTIVE_CONFIG")" +if [ -f "${EFFECTIVE_CONFIG_DIR}/disabled" ]; then OBSERVER_ENABLED=false else OBSERVER_ENABLED=false - CONFIG_FILE="${SKILL_ROOT}/config.json" - # Allow CLV2_CONFIG override - if [ -n "${CLV2_CONFIG:-}" ]; then - CONFIG_FILE="$CLV2_CONFIG" - fi - # Use effective config path for both existence check and reading - EFFECTIVE_CONFIG="$CONFIG_FILE" + # Use effective config path for both existence check and reading if [ -f "$EFFECTIVE_CONFIG" ] && [ -n "$PYTHON_CMD" ]; then🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 292 - 303, The current disabled-sentinel check uses ${CONFIG_DIR}/disabled while the script resolves CONFIG_FILE/EFFECTIVE_CONFIG (including CLV2_CONFIG override) later, causing divergence; move the EFFECTIVE_CONFIG/CONFIG_FILE resolution above the disabled check and then test for a sentinel in the same directory as EFFECTIVE_CONFIG (e.g., check "$(dirname "$EFFECTIVE_CONFIG")/disabled"), and set OBSERVER_ENABLED=false if that sentinel exists; update references to OBSERVER_ENABLED, CONFIG_FILE, CLV2_CONFIG, and EFFECTIVE_CONFIG accordingly so the disable behavior follows the resolved config path.
331-333:⚠️ Potential issue | 🟠 MajorLock is released before observer startup is confirmed, so duplicate starts remain possible.
Line 332 and Line 342 run
start-observer.shin background inside the lock. The lock can be released before.observer.pidis created, allowing a concurrent hook to start another observer.Proposed fix
- nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & + "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 || true @@ - nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & + "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 || trueAlso applies to: 341-343
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 331 - 333, The lock in observe.sh is released before the observer process startup is confirmed, allowing a race that can spawn duplicate observers; modify the block that currently runs nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & so that either (a) start-observer.sh is invoked synchronously while holding the lock until it creates .observer.pid, or (b) after launching the background process you poll for the creation of ${PROJECT_DIR}/.observer.pid and ${CONFIG_DIR}/.observer.pid with a short timeout before releasing the lock; ensure the lock is only released once the pid file exists and reference the exact invocation of start-observer.sh and the .observer.pid files to locate and update the code path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 278-289: The _CHECK_OBSERVER_RUNNING function currently trusts PID
file contents; validate the PID before any kill call by ensuring the read value
is a non-empty, all-digits, positive integer (e.g. use a shell regex like [[
"$pid" =~ ^[0-9]+$ ]] and check pid -ne 0) and only call kill -0 when that check
passes; if the PID is missing or invalid (non-numeric or zero) treat it as a
stale PID file, remove the file (rm -f "$pid_file") and return 1. Also apply the
same numeric validation wherever the code reads PID files around the observe
start/stop logic (the other PID-read sites referenced) to avoid passing
malformed values like -1 or abc to kill.
---
Duplicate comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 292-303: The current disabled-sentinel check uses
${CONFIG_DIR}/disabled while the script resolves CONFIG_FILE/EFFECTIVE_CONFIG
(including CLV2_CONFIG override) later, causing divergence; move the
EFFECTIVE_CONFIG/CONFIG_FILE resolution above the disabled check and then test
for a sentinel in the same directory as EFFECTIVE_CONFIG (e.g., check "$(dirname
"$EFFECTIVE_CONFIG")/disabled"), and set OBSERVER_ENABLED=false if that sentinel
exists; update references to OBSERVER_ENABLED, CONFIG_FILE, CLV2_CONFIG, and
EFFECTIVE_CONFIG accordingly so the disable behavior follows the resolved config
path.
- Around line 331-333: The lock in observe.sh is released before the observer
process startup is confirmed, allowing a race that can spawn duplicate
observers; modify the block that currently runs nohup
"${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & so that either
(a) start-observer.sh is invoked synchronously while holding the lock until it
creates .observer.pid, or (b) after launching the background process you poll
for the creation of ${PROJECT_DIR}/.observer.pid and ${CONFIG_DIR}/.observer.pid
with a short timeout before releasing the lock; ensure the lock is only released
once the pid file exists and reference the exact invocation of start-observer.sh
and the .observer.pid files to locate and update the code path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ba712604-867b-48b2-80a5-3786da5da5ce
📒 Files selected for processing (1)
skills/continuous-learning-v2/hooks/observe.sh
There was a problem hiding this comment.
3 issues found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/continuous-learning-v2/hooks/observe.sh">
<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:332">
P1:</violation>
<violation number="2" location="skills/continuous-learning-v2/hooks/observe.sh:338">
P2: Avoid `exit 0` in the parent shell here; on lock contention it terminates the whole hook and skips the observer signal pass for this invocation.</violation>
<violation number="3" location="skills/continuous-learning-v2/hooks/observe.sh:338">
P1: Harden the macOS `lockfile` path against stale locks. If the script exits between acquisition and cleanup, `.observer-start.lock` can be left behind and prevent future lazy-start attempts.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
- Wrap lockfile block in subshell so exit 0 only terminates that block - Add trap for EXIT to clean up lock file on script interruption - Add -l 30 (30 second expiry) to prevent permanent lock file stuck Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/continuous-learning-v2/hooks/observe.sh">
<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:340">
P0: Race condition: The EXIT trap is set before attempting to acquire the lockfile, so if lockfile fails (because another process holds the lock), the trap will delete another process's active lock when the subshell exits. This breaks the mutual-exclusion guarantee and can cause duplicate observer starts.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| if command -v lockfile >/dev/null 2>&1; then | ||
| # Use subshell to isolate exit and add trap for cleanup | ||
| ( | ||
| trap 'rm -f "$LAZY_START_LOCK" 2>/dev/null || true' EXIT |
There was a problem hiding this comment.
P0: Race condition: The EXIT trap is set before attempting to acquire the lockfile, so if lockfile fails (because another process holds the lock), the trap will delete another process's active lock when the subshell exits. This breaks the mutual-exclusion guarantee and can cause duplicate observer starts.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 340:
<comment>Race condition: The EXIT trap is set before attempting to acquire the lockfile, so if lockfile fails (because another process holds the lock), the trap will delete another process's active lock when the subshell exits. This breaks the mutual-exclusion guarantee and can cause duplicate observer starts.</comment>
<file context>
@@ -335,13 +335,17 @@ if [ "$OBSERVER_ENABLED" = "true" ]; then
- rm -f "$LAZY_START_LOCK" 2>/dev/null || true
+ # Use subshell to isolate exit and add trap for cleanup
+ (
+ trap 'rm -f "$LAZY_START_LOCK" 2>/dev/null || true' EXIT
+ lockfile -r 1 -l 30 "$LAZY_START_LOCK" 2>/dev/null || exit 0
+ _CHECK_OBSERVER_RUNNING "${PROJECT_DIR}/.observer.pid" || true
</file context>
| ( | ||
| flock -n 9 || exit 0 | ||
| # Double-check PID files after acquiring lock | ||
| _CHECK_OBSERVER_RUNNING "${PROJECT_DIR}/.observer.pid" || true | ||
| _CHECK_OBSERVER_RUNNING "${CONFIG_DIR}/.observer.pid" || true | ||
| if [ ! -f "${PROJECT_DIR}/.observer.pid" ] && [ ! -f "${CONFIG_DIR}/.observer.pid" ]; then | ||
| nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & | ||
| fi | ||
| ) 9>"$LAZY_START_LOCK" |
There was a problem hiding this comment.
flock released before PID file is written — observer window race remains
The flock subshell exits (releasing the lock on line 334) immediately after backgrounding start-observer.sh with nohup ... &. At that point, start-observer.sh is still initializing — it launches observer-loop.sh in the background and only then sleep 2s before checking for the PID file. The PID file is written by observer-loop.sh asynchronously.
This creates a window of potentially 1–2+ seconds where:
- The flock is released (
flock -n 9is gone) - No PID file exists yet
- The next hook invocation acquires the lock, passes the double-check inside the subshell, and launches a second
start-observer.sh start
Because start-observer.sh only guards against duplicates by checking $PID_FILE (line 175 of start-observer.sh) — and the PID file doesn't yet exist — each observer instance in this window starts a new observer-loop.sh subprocess.
A robust fix is to write a "starting" sentinel inside the lock before backgrounding, and check it alongside the PID files:
LAZY_STARTING_SENTINEL="${PROJECT_DIR}/.observer-starting"
...
(
flock -n 9 || exit 0
_CHECK_OBSERVER_RUNNING "${PROJECT_DIR}/.observer.pid" || true
_CHECK_OBSERVER_RUNNING "${CONFIG_DIR}/.observer.pid" || true
if [ ! -f "${PROJECT_DIR}/.observer.pid" ] && \
[ ! -f "${CONFIG_DIR}/.observer.pid" ] && \
[ ! -f "$LAZY_STARTING_SENTINEL" ]; then
touch "$LAZY_STARTING_SENTINEL"
nohup bash -c '"'"'
"${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1
rm -f "$LAZY_STARTING_SENTINEL"
'"'"' >/dev/null 2>&1 &
fi
) 9>"$LAZY_START_LOCK"The sentinel is checked (inside the lock) by all concurrent invocations and prevents a second launch; once start-observer.sh finishes and the PID file is stable the sentinel is removed.
There was a problem hiding this comment.
♻️ Duplicate comments (3)
skills/continuous-learning-v2/hooks/observe.sh (3)
292-303:⚠️ Potential issue | 🟠 MajorPropagate
CLV2_CONFIGconsistently across disable checks, PID lookups, and startup.Lines 296-302 only switch the
observer.enabledread.CONFIG_DIRstill stays on${HOME}/.claude/homunculus, so thedisabledsentinel and global PID checks keep using the default directory, andskills/continuous-learning-v2/agents/start-observer.shstill resolves its own default config file. With a customCLV2_CONFIG, lazy-start can read one config but signal/start a different observer.♻️ Suggested direction
- CONFIG_FILE="${SKILL_ROOT}/config.json" - # Allow CLV2_CONFIG override - if [ -n "${CLV2_CONFIG:-}" ]; then - CONFIG_FILE="$CLV2_CONFIG" - fi - # Use effective config path for both existence check and reading - EFFECTIVE_CONFIG="$CONFIG_FILE" + DEFAULT_CONFIG_FILE="${SKILL_ROOT}/config.json" + EFFECTIVE_CONFIG="${CLV2_CONFIG:-$DEFAULT_CONFIG_FILE}" + CONFIG_DIR="$(dirname "$EFFECTIVE_CONFIG")"Apply the same effective config resolution to the earlier disabled guard and to
skills/continuous-learning-v2/agents/start-observer.sh.Also applies to: 316-323, 332-345
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 292 - 303, The script computes EFFECTIVE_CONFIG too late so checks for the disabled sentinel and PID/start logic still use the default CONFIG_DIR and CONFIG_FILE; move or duplicate the CLV2_CONFIG override and EFFECTIVE_CONFIG resolution earlier and use EFFECTIVE_CONFIG (not CONFIG_DIR or the original CONFIG_FILE) for the disabled file check, PID lookups, OBSERVER_ENABLED decisions, and when invoking skills/continuous-learning-v2/agents/start-observer.sh; also update start-observer.sh to accept/resolve CLV2_CONFIG or an explicit config path so the started observer uses the same EFFECTIVE_CONFIG value.
331-345:⚠️ Potential issue | 🟠 MajorKeep the startup lock until the observer PID is visible.
Line 332 and Line 345 background
start-observer.sh, butskills/continuous-learning-v2/agents/start-observer.sh:195-211already daemonizes and returns beforeskills/continuous-learning-v2/agents/observer-loop.sh:137-138writes the PID file. That leaves a window where the lock is gone and the next hook can start a second observer.🔒 Suggested fix
- nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & + "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 || true + for _ in 1 2 3 4 5; do + _CHECK_OBSERVER_RUNNING "${PROJECT_DIR}/.observer.pid" && break + _CHECK_OBSERVER_RUNNING "${CONFIG_DIR}/.observer.pid" && break + sleep 1 + doneApply the same change in the macOS fallback branch.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 331 - 345, The macOS fallback branch starts start-observer.sh in background and releases LAZY_START_LOCK too early; modify the lockfile branch (inside the subshell where lockfile -r 1 -l 30 "$LAZY_START_LOCK" is used) to mirror the Linux branch behavior: after launching nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 &, do a short polling loop (with a reasonable timeout) that checks for either "${PROJECT_DIR}/.observer.pid" or "${CONFIG_DIR}/.observer.pid" (reuse _CHECK_OBSERVER_RUNNING or plain -f checks) and only allow the subshell to exit (and thus remove the LAZY_START_LOCK via the existing EXIT trap) once one of those PID files is present or the timeout elapses; keep the existing trap and ensure the lock is still held during the wait so a second observer cannot be started.
278-289:⚠️ Potential issue | 🔴 CriticalReject unreadable or malformed PID files before calling
kill.Line 282 and Line 358 can fail under
set -eif another hook removes the file after the-fcheck, and the current[ -n "$pid" ]gate still lets values like-1,0, orabcreachkill. Treat unreadable or non-numeric PIDs as stale in both paths.🛡️ Suggested fix
_CHECK_OBSERVER_RUNNING() { local pid_file="$1" if [ -f "$pid_file" ]; then local pid - pid=$(cat "$pid_file" 2>/dev/null) - if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then + pid=$(cat "$pid_file" 2>/dev/null || true) + case "$pid" in + ''|*[!0-9]*|0|1) + rm -f "$pid_file" 2>/dev/null || true + return 1 + ;; + esac + if kill -0 "$pid" 2>/dev/null; then return 0 # Process is alive fi # Stale PID file - remove it rm -f "$pid_file" 2>/dev/null || true fi @@ for pid_file in "${PROJECT_DIR}/.observer.pid" "${CONFIG_DIR}/.observer.pid"; do if [ -f "$pid_file" ]; then - observer_pid=$(cat "$pid_file") + observer_pid=$(cat "$pid_file" 2>/dev/null || true) + case "$observer_pid" in + ''|*[!0-9]*|0|1) + rm -f "$pid_file" 2>/dev/null || true + continue + ;; + esac # Deduplicate: skip if already signaled this pass case "$signaled_pids" in *" $observer_pid "*) continue ;; esacAlso applies to: 356-365
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/continuous-learning-v2/hooks/observe.sh` around lines 278 - 289, The PID handling in _CHECK_OBSERVER_RUNNING allows unreadable or malformed values to reach kill and can fail under set -e if the file is removed between the -f check and reading; replace the current cat/kill flow with a safe read and numeric validation: use read -r pid < "$pid_file" 2>/dev/null (or a subshell read) and if the read fails treat the PID file as stale and rm -f it, then validate pid with a regex like ^[0-9]+$ and ensure pid is greater than 0 before calling kill -0; if validation fails remove the pid file and return 1. Apply the same robust read+numeric check logic to the other similar block referenced (the logic around lines handling PID file at the other path / function).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@skills/continuous-learning-v2/hooks/observe.sh`:
- Around line 292-303: The script computes EFFECTIVE_CONFIG too late so checks
for the disabled sentinel and PID/start logic still use the default CONFIG_DIR
and CONFIG_FILE; move or duplicate the CLV2_CONFIG override and EFFECTIVE_CONFIG
resolution earlier and use EFFECTIVE_CONFIG (not CONFIG_DIR or the original
CONFIG_FILE) for the disabled file check, PID lookups, OBSERVER_ENABLED
decisions, and when invoking
skills/continuous-learning-v2/agents/start-observer.sh; also update
start-observer.sh to accept/resolve CLV2_CONFIG or an explicit config path so
the started observer uses the same EFFECTIVE_CONFIG value.
- Around line 331-345: The macOS fallback branch starts start-observer.sh in
background and releases LAZY_START_LOCK too early; modify the lockfile branch
(inside the subshell where lockfile -r 1 -l 30 "$LAZY_START_LOCK" is used) to
mirror the Linux branch behavior: after launching nohup
"${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 &, do a short
polling loop (with a reasonable timeout) that checks for either
"${PROJECT_DIR}/.observer.pid" or "${CONFIG_DIR}/.observer.pid" (reuse
_CHECK_OBSERVER_RUNNING or plain -f checks) and only allow the subshell to exit
(and thus remove the LAZY_START_LOCK via the existing EXIT trap) once one of
those PID files is present or the timeout elapses; keep the existing trap and
ensure the lock is still held during the wait so a second observer cannot be
started.
- Around line 278-289: The PID handling in _CHECK_OBSERVER_RUNNING allows
unreadable or malformed values to reach kill and can fail under set -e if the
file is removed between the -f check and reading; replace the current cat/kill
flow with a safe read and numeric validation: use read -r pid < "$pid_file"
2>/dev/null (or a subshell read) and if the read fails treat the PID file as
stale and rm -f it, then validate pid with a regex like ^[0-9]+$ and ensure pid
is greater than 0 before calling kill -0; if validation fails remove the pid
file and return 1. Apply the same robust read+numeric check logic to the other
similar block referenced (the logic around lines handling PID file at the other
path / function).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2b42327f-5973-4aef-8944-1f492c95c6e5
📒 Files selected for processing (1)
skills/continuous-learning-v2/hooks/observe.sh
- Validate PID is a positive integer before kill calls to prevent signaling invalid targets (e.g. -1 could signal all processes) - Pass config path via env var instead of interpolating shell variable into Python -c string to prevent injection/breakage on special paths - Check CLV2_CONFIG-derived directory for disabled file so disable guard respects the same config source as lazy-start Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| _CHECK_OBSERVER_RUNNING "${PROJECT_DIR}/.observer.pid" || true | ||
| _CHECK_OBSERVER_RUNNING "${CONFIG_DIR}/.observer.pid" || true | ||
| if [ ! -f "${PROJECT_DIR}/.observer.pid" ] && [ ! -f "${CONFIG_DIR}/.observer.pid" ]; then | ||
| nohup "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 & |
There was a problem hiding this comment.
CLV2_CONFIG not forwarded to start-observer.sh
When a custom config is active via CLV2_CONFIG, observe.sh correctly reads observer.enabled: true from it and sets OBSERVER_ENABLED=true. However, when it calls start-observer.sh start (line 342), it does NOT forward CLV2_CONFIG in any effective way — start-observer.sh hardcodes CONFIG_FILE="${SKILL_ROOT}/config.json" (line 39 of start-observer.sh) and never inspects CLV2_CONFIG when choosing which config file to load.
If the default config.json has observer.enabled: false (or is missing), start-observer.sh will reach its guard on line 168:
if [ "$OBSERVER_ENABLED" != "true" ]; then
echo "Observer is disabled in config.json..."
exit 1
fiand exit with code 1 — but because stdout and stderr are both redirected to /dev/null, this failure is completely silent. The observer never starts, and there is no indication of why.
The fix requires either passing the config path explicitly to start-observer.sh, or making start-observer.sh aware of CLV2_CONFIG. A minimal approach:
nohup env CLV2_CONFIG="${EFFECTIVE_CONFIG}" "${SKILL_ROOT}/agents/start-observer.sh" start >/dev/null 2>&1 &The same fix is needed on line 355 in the lockfile path.
There was a problem hiding this comment.
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/continuous-learning-v2/hooks/observe.sh">
<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:293">
P1: PID reuse vulnerability: The observer lifecycle only validates PID format, not process identity. A stale PID file combined with PID reuse can cause SIGUSR1 to be sent to unrelated processes, potentially killing them and leaving the real observer unstarted.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| return 1 | ||
| ;; | ||
| esac | ||
| if kill -0 "$pid" 2>/dev/null; then |
There was a problem hiding this comment.
P1: PID reuse vulnerability: The observer lifecycle only validates PID format, not process identity. A stale PID file combined with PID reuse can cause SIGUSR1 to be sent to unrelated processes, potentially killing them and leaving the real observer unstarted.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 293:
<comment>PID reuse vulnerability: The observer lifecycle only validates PID format, not process identity. A stale PID file combined with PID reuse can cause SIGUSR1 to be sent to unrelated processes, potentially killing them and leaving the real observer unstarted.</comment>
<file context>
@@ -280,7 +283,14 @@ _CHECK_OBSERVER_RUNNING() {
+ return 1
+ ;;
+ esac
+ if kill -0 "$pid" 2>/dev/null; then
return 0 # Process is alive
fi
</file context>
Merge upstream/main which includes: - fix: observer memory explosion with throttling and tail sampling (affaan-m#536) - fix: observer hooks hardening, secret scrubbing (affaan-m#348) - fix: lazy-start observer logic (affaan-m#508) - fix: read tool_response field in observe.sh (affaan-m#377) - feat: add C++, Java, Rust, PyTorch language support - feat: SQLite state store, skill evolution, session adapters - feat: orchestration harness and selective install - feat: DevFleet multi-agent orchestration skill Conflict resolution: - observe.sh: kept our hook_event_name + tool_response fixes, merged upstream's $PYTHON_CMD, secret scrubbing, lazy-start, throttled signaling, and session guards - package.json: accepted upstream's new scripts and dependencies
) * feat(continuous-learning-v2): add lazy-start observer logic Auto-starts observer when observer.enabled: true in config and no .observer.pid exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(continuous-learning-v2): address PR review concerns - Use flock for atomic check-then-act to prevent race conditions - Check both project-scoped AND global PID files before starting - Support CLV2_CONFIG override for config file path - Check disabled file in lazy-start logic - Use double-check pattern after acquiring lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(observe.sh): address PR review comments - Add stale PID cleanup via _CHECK_OBSERVER_RUNNING function - Add macOS fallback using lockfile when flock unavailable - Fix CLV2_CONFIG override: use EFFECTIVE_CONFIG for both check and read - Use proper Python context manager (with open() as f) - Deduplicate signaled PIDs to avoid duplicate USR1 signals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(observe.sh): wrap macOS lockfile fallback in subshell with trap - Wrap lockfile block in subshell so exit 0 only terminates that block - Add trap for EXIT to clean up lock file on script interruption - Add -l 30 (30 second expiry) to prevent permanent lock file stuck Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(observe.sh): address remaining PR review comments - Validate PID is a positive integer before kill calls to prevent signaling invalid targets (e.g. -1 could signal all processes) - Pass config path via env var instead of interpolating shell variable into Python -c string to prevent injection/breakage on special paths - Check CLV2_CONFIG-derived directory for disabled file so disable guard respects the same config source as lazy-start Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
observer.enabled: truein config and no.observer.pidexistsTest plan
observer.enabled: truein config.json.observer.pidfileSummary by CodeRabbit
Improvements
Bug Fixes