
fix(continuous-learning-v2): prevent observer memory explosion #536

Merged
affaan-m merged 1 commit into main from fix/observer-memory
Mar 16, 2026

Conversation

affaan-m commented Mar 16, 2026

Owner

Summary

Fixes #521: the background observer agent caused a memory explosion and crash because of a positive feedback loop between the SIGUSR1 signaling in observe.sh and the analysis in observer-loop.sh.

Three root causes addressed:

  • SIGUSR1 throttling (observe.sh): Previously, every tool call sent a signal to the observer. observe.sh now tracks invocations in a counter file and signals only every 20 observations (configurable via ECC_OBSERVER_SIGNAL_EVERY_N).
  • Re-entrancy guard (observer-loop.sh on_usr1()): Added ANALYZING flag that prevents new analysis from spawning while one is already running. Previously, signals during analysis spawned parallel Claude processes.
  • Cooldown + tail-based sampling (observer-loop.sh): 60s cooldown between analyses (ECC_OBSERVER_ANALYSIS_COOLDOWN). Only sends the last 500 lines to the LLM (ECC_OBSERVER_MAX_ANALYSIS_LINES) instead of the entire 4.8MB file.
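
The tail-based sampling in the third item can be sketched roughly as follows. This is a minimal illustration, not the code in observer-loop.sh: the sample_observations helper, the temp-file naming, and the demo data are assumptions. Only ECC_OBSERVER_MAX_ANALYSIS_LINES and its default of 500 come from the description above.

```shell
#!/bin/sh
# Hypothetical sketch of tail-based sampling. Every name except
# ECC_OBSERVER_MAX_ANALYSIS_LINES is an illustrative assumption.
MAX_ANALYSIS_LINES="${ECC_OBSERVER_MAX_ANALYSIS_LINES:-500}"

# Copy only the newest N lines of the observations file into a temp file
# and print that temp file's path. Cleanup is left to the caller.
sample_observations() {
  obs_file=$1
  analysis_file=$(mktemp "${TMPDIR:-/tmp}/observer-analysis.XXXXXX") || return 1
  tail -n "$MAX_ANALYSIS_LINES" "$obs_file" > "$analysis_file"
  printf '%s\n' "$analysis_file"
}

# Demo: a 1200-line stand-in for observations.jsonl.
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 1200 ]; do
  printf '{"seq":%d}\n' "$i"
  i=$((i + 1))
done > "$demo_dir/observations.jsonl"

sampled=$(sample_observations "$demo_dir/observations.jsonl")
sampled_lines=$(wc -l < "$sampled" | tr -d ' ')
first_line=$(head -n 1 "$sampled")
echo "sampled $sampled_lines lines, starting at $first_line"

# Clean up the temp file after the (omitted) analysis step.
rm -f "$sampled"
rm -rf "$demo_dir"
```

With the default limit, the analysis input is capped at 500 lines no matter how large the observations file grows, which is what bounds the LLM payload.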

Test plan

  • 17 new tests in tests/hooks/observer-memory.test.js — all passing
  • Manual: Enable observer, run a session with rapid tool calls, verify no parallel Claude processes spawn
  • Manual: Verify observations.jsonl stays bounded and archives rotate correctly
  • Verify existing hook tests still pass (node tests/run-all.js)

Summary by cubic

Prevents the observer’s memory explosion by breaking the SIGUSR1/analysis feedback loop. Adds throttling, a re-entrancy guard, a cooldown, and tail-based sampling to keep analyses bounded. Fixes #521.

  • Bug Fixes
    • Throttle SIGUSR1 in observe.sh: signal every N events (default 20 via ECC_OBSERVER_SIGNAL_EVERY_N) using a .observer-signal-counter file.
    • Add re-entrancy guard in observer-loop.sh: ANALYZING flag blocks overlapping analyses; add 60s cooldown (ECC_OBSERVER_ANALYSIS_COOLDOWN).
    • Tail-based sampling in observer-loop.sh: analyze only the last 500 lines (ECC_OBSERVER_MAX_ANALYSIS_LINES) via a temp file; clean up after run.
    • Tests: 17 new cases in tests/hooks/observer-memory.test.js covering throttling, sampling, and re-entrancy.

Written for commit 3a26747. Summary will update on new commits.

Summary by CodeRabbit

  • Bug Fixes

    • Added re-entrancy guard to prevent concurrent analyses
    • Implemented cooldown throttling for analysis triggers
    • Added signal throttling to reduce observation processing frequency
    • Enabled tail-based sampling for improved memory efficiency
  • Tests

    • Added comprehensive test suite validating observer throttling, memory management, and analysis workflow

Three fixes for the positive feedback loop causing runaway memory usage:

1. SIGUSR1 throttling in observe.sh: Signal observer only every 20
   observations (configurable via ECC_OBSERVER_SIGNAL_EVERY_N) instead
   of on every tool call. Uses a counter file to track invocations.

2. Re-entrancy guard in observer-loop.sh on_usr1(): ANALYZING flag
   prevents parallel Claude analysis processes from spawning when
   signals arrive while analysis is already running.

3. Cooldown + tail-based sampling in observer-loop.sh:
   - 60s cooldown between analyses (ECC_OBSERVER_ANALYSIS_COOLDOWN)
   - Only last 500 lines sent to LLM (ECC_OBSERVER_MAX_ANALYSIS_LINES)
     instead of the entire observations file

Closes #521

coderabbitai Bot commented Mar 16, 2026

Contributor

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df36998f-d835-4795-b392-238a2f1d37a3

📥 Commits

Reviewing files that changed from the base of the PR and between bb27dde and 3a26747.

📒 Files selected for processing (3)
  • skills/continuous-learning-v2/agents/observer-loop.sh
  • skills/continuous-learning-v2/hooks/observe.sh
  • tests/hooks/observer-memory.test.js

Walkthrough

This PR implements throttling and batching mechanisms to prevent the feedback loop causing memory exhaustion in the observer system. Changes include signal throttling via counter-based gating in observe.sh, re-entrancy guards and cooldown constraints in observer-loop.sh, tail-based sampling of observations, and comprehensive test coverage validating all new mechanisms.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Observer Loop Throttling & Sampling (skills/continuous-learning-v2/agents/observer-loop.sh) | Introduces re-entrancy guard (ANALYZING flag), cooldown mechanism (ANALYSIS_COOLDOWN, LAST_ANALYSIS_EPOCH), tail-based sampling (MAX_ANALYSIS_LINES) to limit analysis scope, and cleanup of analysis temp files. Enhanced on_usr1 handler enforces guards before and resets state after analysis. |
| Signal Throttling (skills/continuous-learning-v2/hooks/observe.sh) | Replaces unconditional SIGUSR1 signaling with counter-based gating using a persistent counter file. Signals only when counter reaches SIGNAL_EVERY_N threshold (default 20), then resets, decoupling signal frequency from observation rate. |
| Memory Management Test Suite (tests/hooks/observer-memory.test.js) | New comprehensive test file validating re-entrancy guards, cooldown enforcement, tail-based sampling, signal throttling counter behavior, temp file cleanup, and end-to-end observer behavior under normal and edge cases. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

  • Memory Explosion in Continuous Learning v2 Observer #521 — This PR directly addresses all three root causes of the memory explosion: implements signal throttling (no more every-call SIGUSR1), adds observation batching via tail sampling (no longer reads entire file), and introduces re-entrancy guards (prevents concurrent analyses).

Poem

🐰 A counter clicks, a guard stands tall,
No more signals answering every call,
Observations sampled, memories freed,
The observer breathes at measured speed! 🎯

@affaan-m affaan-m merged commit f9e8287 into main Mar 16, 2026
3 of 39 checks passed

greptile-apps Bot commented Mar 16, 2026

Contributor

Greptile Summary

This PR addresses the observer memory explosion from issue #521 by introducing three complementary throttling mechanisms: SIGUSR1 signal throttling in observe.sh (every 20 tool calls), a re-entrancy guard (ANALYZING flag) and cooldown timer in observer-loop.sh's on_usr1 handler, and tail-based sampling (last 500 lines) to cap the LLM payload size. The changes integrate cleanly with the existing session-guardian.sh and lock-file patterns already established in the skill.

Key issues found:

  • Re-entrancy guard has a gap: The ANALYZING=1 flag is only set within on_usr1(). The main while loop calls analyze_observations directly (line 183) without setting the flag. If a SIGUSR1 arrives during a scheduled (non-signal-triggered) analysis, specifically during its wait "$claude_pid", on_usr1 will see ANALYZING=0, pass the guard, and spawn a second concurrent Claude process. LAST_ANALYSIS_EPOCH is also not updated from this path, so the cooldown does not provide a backstop here either.

  • Signal counter file is unprotected against concurrent writes: The read-increment-write on .observer-signal-counter in observe.sh is not atomic. Concurrent hook invocations can both read the same counter value, both evaluate >= SIGNAL_EVERY_N, and both send kill -USR1, sending two simultaneous signals — which was the root cause of the original explosion.

  • Cooldown timer starts even on skipped analyses: LAST_ANALYSIS_EPOCH is updated after every analyze_observations call regardless of whether it did any work (e.g., not enough observations, claude not found), which can unnecessarily delay legitimate analyses.

  • Tests are structural, not behavioral: All 17 tests validate the presence of code patterns via string matching rather than exercising the runtime signal-handling and concurrency semantics, so the residual race conditions above are not caught.

Confidence Score: 2/5

  • The PR improves on the original explosion scenario but leaves a residual re-entrancy window in the scheduled-analysis path and an unguarded race condition on the counter file.
  • The three mechanisms (throttle, guard, tail-sampling) are sound in concept and address the most common trigger path. However, the ANALYZING flag is never set in the main loop's direct call to analyze_observations, meaning a SIGUSR1 arriving during scheduled analysis can still spawn parallel Claude processes — the exact scenario that caused Memory Explosion in Continuous Learning v2 Observer #521. The unprotected counter file write adds a second concurrent path to the same outcome. Both are logic-level bugs rather than edge-case style concerns.
  • Pay close attention to observer-loop.sh (main while-loop else branch, lines 180-184) and observe.sh (counter read-write block, lines 374-385).

Important Files Changed

| Filename | Overview |
| --- | --- |
| skills/continuous-learning-v2/agents/observer-loop.sh | Adds re-entrancy guard (ANALYZING flag), cooldown timer, and tail-based sampling to prevent runaway parallel Claude processes. However, the ANALYZING guard is only set in on_usr1(); the main loop's direct analyze_observations call bypasses it entirely, leaving a residual re-entrancy window during scheduled (non-signal-triggered) analyses. |
| skills/continuous-learning-v2/hooks/observe.sh | Adds SIGUSR1 throttling via a per-project counter file, signaling the observer only every 20 tool calls. The counter file read-increment-write is not atomic; concurrent observe.sh invocations (parallel tool calls) can both read the same counter value and both send a signal simultaneously, partially recreating the original memory explosion scenario. |
| tests/hooks/observer-memory.test.js | 17 new static/structural tests validating the three fix mechanisms. Tests are primarily pattern-matching against file contents rather than runtime behavioral tests, so they validate the code is present but do not exercise the actual race conditions or re-entrancy scenarios end-to-end. |

Sequence Diagram

sequenceDiagram
    participant Hook as observe.sh (hook)
    participant Counter as .observer-signal-counter
    participant Observer as observer-loop.sh (bg)
    participant Claude as claude (analysis)

    Note over Hook,Counter: Every tool call
    Hook->>Counter: read counter
    Counter-->>Hook: N
    Hook->>Counter: write N+1
    alt N+1 >= SIGNAL_EVERY_N (20)
        Hook->>Counter: write 0 (reset)
        Hook->>Observer: kill -USR1
        Note over Observer: on_usr1() fires
        alt ANALYZING == 1
            Observer-->>Observer: skip (re-entrancy guard)
        else elapsed < ANALYSIS_COOLDOWN (60s)
            Observer-->>Observer: skip (cooldown)
        else
            Observer->>Observer: ANALYZING=1
            Observer->>Observer: tail last 500 lines → analysis_file
            Observer->>Claude: spawn claude --print < prompt_file
            Note over Claude: runs up to 120s / 10 turns
            Claude-->>Observer: exit
            Observer->>Observer: LAST_ANALYSIS_EPOCH = now
            Observer->>Observer: ANALYZING=0
            Observer->>Observer: archive observations.jsonl
        end
    end

    Note over Observer: Periodic (every 300s, main loop)
    Observer->>Observer: analyze_observations()
    Note over Observer: ⚠️ ANALYZING flag NOT set here
    Observer->>Claude: spawn claude
    alt SIGUSR1 fires during wait
        Hook->>Observer: kill -USR1
        Note over Observer: ANALYZING==0 → guard bypassed!
        Observer->>Claude: spawn 2nd concurrent claude
    end

Comments Outside Diff (1)

  1. skills/continuous-learning-v2/agents/observer-loop.sh, lines 180-184

    Re-entrancy guard bypassed by scheduled (non-signal) analysis

    The ANALYZING flag is set in on_usr1() (line 164), but the main while loop's else branch calls analyze_observations directly at line 183 without setting ANALYZING=1 first.

    This means the guard is only effective when two SIGUSR1-triggered analyses overlap. If a SIGUSR1 fires while the scheduled periodic analysis is running — specifically during its internal wait "$claude_pid" call — on_usr1 will check ANALYZING (still 0), pass the guard, and spawn a second concurrent Claude process. This is exactly the race condition the fix for Memory Explosion in Continuous Learning v2 Observer #521 intended to prevent.

    The fix is to wrap the main loop's direct call the same way:

      if [ "$USR1_FIRED" -eq 1 ]; then
        USR1_FIRED=0
      else
        ANALYZING=1
        analyze_observations
        LAST_ANALYSIS_EPOCH=$(date +%s)
        ANALYZING=0
      fi
    

    Note that LAST_ANALYSIS_EPOCH is also not updated from this path currently, so the cooldown in on_usr1 would not protect against it either (elapsed time since epoch 0 is always enormous, so the cooldown check always passes).
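
The gap described in this comment can be reproduced in miniature without real signals. The sketch below is an illustration under assumptions, not the code in observer-loop.sh: analyze_observations is a stub, RUNS is a hypothetical counter, and the hypothetical SIMULATE_SIGNAL switch stands in for a SIGUSR1 arriving during wait "$claude_pid" by calling on_usr1 from inside the stub. The cooldown is set to 0 so that only the ANALYZING flag is exercised.

```shell
#!/bin/sh
ANALYZING=0
LAST_ANALYSIS_EPOCH=0
ANALYSIS_COOLDOWN=0   # disabled so only the re-entrancy guard is exercised
RUNS=0                # hypothetical counter: how many analyses started
SIMULATE_SIGNAL=0     # hypothetical switch: emulate one SIGUSR1 mid-analysis

# Stub for the real analysis. It counts runs and, when asked, "receives"
# a signal mid-run by invoking the handler directly.
analyze_observations() {
  RUNS=$((RUNS + 1))
  if [ "$SIMULATE_SIGNAL" -eq 1 ]; then
    SIMULATE_SIGNAL=0
    on_usr1
  fi
}

on_usr1() {
  # Re-entrancy guard: skip if an analysis is already in flight.
  if [ "$ANALYZING" -eq 1 ]; then return 0; fi
  now=$(date +%s)
  # Cooldown: skip if the last analysis finished too recently.
  if [ $((now - LAST_ANALYSIS_EPOCH)) -lt "$ANALYSIS_COOLDOWN" ]; then return 0; fi
  ANALYZING=1
  analyze_observations
  LAST_ANALYSIS_EPOCH=$(date +%s)
  ANALYZING=0
}

# Scheduled path without the flag: the nested handler passes the guard.
scheduled_unguarded() { analyze_observations; }

# Scheduled path with the wrapper suggested in the review.
scheduled_guarded() {
  ANALYZING=1
  analyze_observations
  LAST_ANALYSIS_EPOCH=$(date +%s)
  ANALYZING=0
}

RUNS=0; SIMULATE_SIGNAL=1; scheduled_unguarded; unguarded_runs=$RUNS
RUNS=0; SIMULATE_SIGNAL=1; scheduled_guarded;   guarded_runs=$RUNS
echo "unguarded: $unguarded_runs analyses, guarded: $guarded_runs"
```

In this simulation the unguarded path starts two analyses for one scheduled run, while the guarded path starts one. A real signal interrupts wait rather than nesting a function call, so this only models the guard logic, not the process-level concurrency.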

Last reviewed commit: 3a26747

Comment on lines +374 to +385
should_signal=0
if [ -f "$SIGNAL_COUNTER_FILE" ]; then
  counter=$(cat "$SIGNAL_COUNTER_FILE" 2>/dev/null || echo 0)
  counter=$((counter + 1))
  if [ "$counter" -ge "$SIGNAL_EVERY_N" ]; then
    should_signal=1
    counter=0
  fi
  echo "$counter" > "$SIGNAL_COUNTER_FILE"
else
  echo "1" > "$SIGNAL_COUNTER_FILE"
fi

Counter file has no locking — concurrent invocations can double-signal

The counter read-increment-write is not atomic. Claude Code can invoke hooks concurrently for parallel tool calls, meaning two observe.sh processes can:

  1. Both read the counter file at the same value (e.g., 19)
  2. Both increment to 20, both evaluate >= SIGNAL_EVERY_N as true
  3. Both set should_signal=1 and send kill -USR1 to the observer

This means two signals get sent simultaneously, and then two sequential on_usr1 calls fire. If the first on_usr1 completes quickly (e.g., hits an early-return branch) and resets ANALYZING=0 before the second one executes, two full analyses can be spawned — recreating the root cause of #521.

Consider protecting the counter update with flock (consistent with the existing locking pattern used above for the lazy-start logic):

if command -v flock >/dev/null 2>&1; then
  (
    flock -x 9
    counter=$(cat "$SIGNAL_COUNTER_FILE" 2>/dev/null || echo 0)
    counter=$((counter + 1))
    if [ "$counter" -ge "$SIGNAL_EVERY_N" ]; then
      should_signal=1
      counter=0
    fi
    echo "$counter" > "$SIGNAL_COUNTER_FILE"
  ) 9>"${SIGNAL_COUNTER_FILE}.lock"
fi
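
One caveat about the suggestion above: the parenthesized block runs in a subshell, so an assignment to should_signal inside it is not visible to the parent shell once the block exits. A possible variant, sketched under the assumption that the surrounding script reads should_signal afterwards, reports the decision through the subshell's exit status instead. The bump_counter helper and the demo counter path are hypothetical, and when flock is unavailable the sketch falls back to an unlocked update.

```shell
#!/bin/sh
SIGNAL_EVERY_N="${ECC_OBSERVER_SIGNAL_EVERY_N:-20}"
# Demo path only; the real script keeps the counter in the project directory.
SIGNAL_COUNTER_FILE="${TMPDIR:-/tmp}/observer-signal-counter.$$"

# Increment the counter under an exclusive lock (when flock exists).
# Exit status 0 means the threshold was reached and the counter was reset.
bump_counter() {
  (
    if command -v flock >/dev/null 2>&1; then
      flock -x 9
    fi
    counter=$(cat "$SIGNAL_COUNTER_FILE" 2>/dev/null || echo 0)
    # Treat an empty or corrupted counter file as zero.
    case "$counter" in ''|*[!0-9]*) counter=0 ;; esac
    counter=$((counter + 1))
    if [ "$counter" -ge "$SIGNAL_EVERY_N" ]; then
      echo 0 > "$SIGNAL_COUNTER_FILE"
      exit 0
    fi
    echo "$counter" > "$SIGNAL_COUNTER_FILE"
    exit 1
  ) 9>"${SIGNAL_COUNTER_FILE}.lock"
}

# Demo: count how many of 45 observations would trigger a signal.
signals=0
i=0
while [ "$i" -lt 45 ]; do
  should_signal=0
  if bump_counter; then
    should_signal=1   # set in the parent shell, so later code can see it
  fi
  signals=$((signals + should_signal))
  i=$((i + 1))
done
remaining=$(cat "$SIGNAL_COUNTER_FILE")
echo "signals sent: $signals, counter now: $remaining"
rm -f "$SIGNAL_COUNTER_FILE" "${SIGNAL_COUNTER_FILE}.lock"
```

Passing the result through the exit status keeps the whole read-increment-write inside the lock while still letting the caller decide whether to send kill -USR1.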

Comment on lines +163 to +167

ANALYZING=1
analyze_observations
LAST_ANALYSIS_EPOCH=$(date +%s)
ANALYZING=0

LAST_ANALYSIS_EPOCH updated even when analyze_observations skips early

analyze_observations has several early-return paths (missing file, obs_count < MIN_OBSERVATIONS, session-guardian denying, claude not found). When these paths are taken, no analysis actually occurs — but LAST_ANALYSIS_EPOCH is still updated on line 166, starting the 60-second cooldown. This can delay legitimate analyses if, for example, the observer file doesn't exist yet during the first few signals.

Consider only updating LAST_ANALYSIS_EPOCH when analysis actually ran (e.g., after successfully spawning claude), or at least moving the update inside analyze_observations after the guards pass.
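
One way to express the first option is to let analyze_observations report through its exit status whether an analysis actually started, and to restart the cooldown only in that case. The sketch below is an illustration under assumptions: the stubbed function body, the OBS_COUNT variable, the MIN_OBSERVATIONS value, the run_analysis wrapper, and the return-code convention are hypothetical. ANALYZING, LAST_ANALYSIS_EPOCH, and the MIN_OBSERVATIONS name are taken from the review.

```shell
#!/bin/sh
ANALYZING=0
LAST_ANALYSIS_EPOCH=0
MIN_OBSERVATIONS=20   # illustrative threshold, not the real default
OBS_COUNT=0           # hypothetical stand-in for counting lines in the file

# Stub: returns 1 on an early exit (nothing analyzed), 0 after a real run.
analyze_observations() {
  if [ "$OBS_COUNT" -lt "$MIN_OBSERVATIONS" ]; then
    return 1
  fi
  # ... the real script would spawn the analysis process and wait here ...
  return 0
}

run_analysis() {
  ANALYZING=1
  if analyze_observations; then
    # Start the cooldown only when work was actually done.
    LAST_ANALYSIS_EPOCH=$(date +%s)
  fi
  ANALYZING=0
}

OBS_COUNT=3;  run_analysis; epoch_after_skip=$LAST_ANALYSIS_EPOCH
OBS_COUNT=50; run_analysis; epoch_after_run=$LAST_ANALYSIS_EPOCH
echo "after skip: $epoch_after_skip, after real run: $epoch_after_run"
```

A skipped run leaves the timestamp untouched, so the next signal is not suppressed by a cooldown that no analysis earned.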

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a26747b80

Comment on lines 165 to +167
analyze_observations
LAST_ANALYSIS_EPOCH=$(date +%s)
ANALYZING=0

P2: Update cooldown timestamp only after real analysis runs

on_usr1() always sets LAST_ANALYSIS_EPOCH after calling analyze_observations, even when that function exits early (for example when the file is missing or obs_count < MIN_OBSERVATIONS). In those no-op cases, subsequent SIGUSR1 events are still suppressed by cooldown, which can delay the first eligible analysis by at least ECC_OBSERVER_ANALYSIS_COOLDOWN seconds despite no Claude run having happened.

@cubic-dev-ai cubic-dev-ai Bot left a comment

4 issues found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tests/hooks/observer-memory.test.js">

<violation number="1" location="tests/hooks/observer-memory.test.js:342">
P2: Use the same platform-aware Python detection in the fallback check; `python3`-only probing can incorrectly skip failures on Windows.</violation>
</file>

<file name="skills/continuous-learning-v2/hooks/observe.sh">

<violation number="1" location="skills/continuous-learning-v2/hooks/observe.sh:371">
P2: Validate `ECC_OBSERVER_SIGNAL_EVERY_N` before numeric comparison; non-integer values currently break throttle evaluation and can disable signaling.</violation>

<violation number="2" location="skills/continuous-learning-v2/hooks/observe.sh:376">
P2: The counter-file throttle is racy: concurrent observe.sh runs can emit multiple SIGUSR1 signals for the same threshold crossing, weakening the memory-loop protection.</violation>
</file>

<file name="skills/continuous-learning-v2/agents/observer-loop.sh">

<violation number="1" location="skills/continuous-learning-v2/agents/observer-loop.sh:166">
P2: Only update `LAST_ANALYSIS_EPOCH` when an analysis run actually starts/completes; updating it after no-op early returns incorrectly triggers cooldown and delays legitimate analyses.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

);
} else {
// If python3 is not available the hook exits early - that is acceptable
const hasPython = spawnSync('python3', ['--version']).status === 0;

@cubic-dev-ai cubic-dev-ai Bot Mar 16, 2026

P2: Use the same platform-aware Python detection in the fallback check; python3-only probing can incorrectly skip failures on Windows.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/hooks/observer-memory.test.js, line 342:

<comment>Use the same platform-aware Python detection in the fallback check; `python3`-only probing can incorrectly skip failures on Windows.</comment>

<file context>
@@ -0,0 +1,360 @@
+    );
+  } else {
+    // If python3 is not available the hook exits early - that is acceptable
+    const hasPython = spawnSync('python3', ['--version']).status === 0;
+    if (hasPython) {
+      assert.fail('Counter file should exist after running observe.sh');
</file context>

# Throttle SIGUSR1: only signal observer every N observations (#521)
# This prevents rapid signaling when tool calls fire every second,
# which caused runaway parallel Claude analysis processes.
SIGNAL_EVERY_N="${ECC_OBSERVER_SIGNAL_EVERY_N:-20}"

@cubic-dev-ai cubic-dev-ai Bot Mar 16, 2026

P2: Validate ECC_OBSERVER_SIGNAL_EVERY_N before numeric comparison; non-integer values currently break throttle evaluation and can disable signaling.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 371:

<comment>Validate `ECC_OBSERVER_SIGNAL_EVERY_N` before numeric comparison; non-integer values currently break throttle evaluation and can disable signaling.</comment>

<file context>
@@ -365,24 +365,45 @@ if [ "$OBSERVER_ENABLED" = "true" ]; then
+# Throttle SIGUSR1: only signal observer every N observations (#521)
+# This prevents rapid signaling when tool calls fire every second,
+# which caused runaway parallel Claude analysis processes.
+SIGNAL_EVERY_N="${ECC_OBSERVER_SIGNAL_EVERY_N:-20}"
+SIGNAL_COUNTER_FILE="${PROJECT_DIR}/.observer-signal-counter"
+
</file context>
Suggested change
-SIGNAL_EVERY_N="${ECC_OBSERVER_SIGNAL_EVERY_N:-20}"
+SIGNAL_EVERY_N="${ECC_OBSERVER_SIGNAL_EVERY_N:-20}"
+case "$SIGNAL_EVERY_N" in
+  ''|*[!0-9]*|0) SIGNAL_EVERY_N=20 ;;
+esac


should_signal=0
if [ -f "$SIGNAL_COUNTER_FILE" ]; then
counter=$(cat "$SIGNAL_COUNTER_FILE" 2>/dev/null || echo 0)

@cubic-dev-ai cubic-dev-ai Bot Mar 16, 2026

P2: The counter-file throttle is racy: concurrent observe.sh runs can emit multiple SIGUSR1 signals for the same threshold crossing, weakening the memory-loop protection.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/hooks/observe.sh, line 376:

<comment>The counter-file throttle is racy: concurrent observe.sh runs can emit multiple SIGUSR1 signals for the same threshold crossing, weakening the memory-loop protection.</comment>

<file context>
@@ -365,24 +365,45 @@ if [ "$OBSERVER_ENABLED" = "true" ]; then
+
+should_signal=0
+if [ -f "$SIGNAL_COUNTER_FILE" ]; then
+  counter=$(cat "$SIGNAL_COUNTER_FILE" 2>/dev/null || echo 0)
+  counter=$((counter + 1))
+  if [ "$counter" -ge "$SIGNAL_EVERY_N" ]; then
</file context>


ANALYZING=1
analyze_observations
LAST_ANALYSIS_EPOCH=$(date +%s)

@cubic-dev-ai cubic-dev-ai Bot Mar 16, 2026

P2: Only update LAST_ANALYSIS_EPOCH when an analysis run actually starts/completes; updating it after no-op early returns incorrectly triggers cooldown and delays legitimate analyses.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/continuous-learning-v2/agents/observer-loop.sh, line 166:

<comment>Only update `LAST_ANALYSIS_EPOCH` when an analysis run actually starts/completes; updating it after no-op early returns incorrectly triggers cooldown and delays legitimate analyses.</comment>

<file context>
@@ -130,7 +146,25 @@ on_usr1() {
+
+  ANALYZING=1
   analyze_observations
+  LAST_ANALYSIS_EPOCH=$(date +%s)
+  ANALYZING=0
 }
</file context>

jacky99714 added a commit to jacky99714/everything-claude-code that referenced this pull request Mar 20, 2026
Merge upstream/main which includes:
- fix: observer memory explosion with throttling and tail sampling (affaan-m#536)
- fix: observer hooks hardening, secret scrubbing (affaan-m#348)
- fix: lazy-start observer logic (affaan-m#508)
- fix: read tool_response field in observe.sh (affaan-m#377)
- feat: add C++, Java, Rust, PyTorch language support
- feat: SQLite state store, skill evolution, session adapters
- feat: orchestration harness and selective install
- feat: DevFleet multi-agent orchestration skill

Conflict resolution:
- observe.sh: kept our hook_event_name + tool_response fixes,
  merged upstream's $PYTHON_CMD, secret scrubbing, lazy-start,
  throttled signaling, and session guards
- package.json: accepted upstream's new scripts and dependencies
@affaan-m affaan-m deleted the fix/observer-memory branch March 20, 2026 07:16
@coderabbitai coderabbitai Bot mentioned this pull request Apr 6, 2026
19 tasks
FrancescoRosciano pushed a commit to FRosciano-Mambo/everything-claude-code that referenced this pull request Jun 1, 2026
…d tail sampling (affaan-m#536)

Closes affaan-m#521

Development

Successfully merging this pull request may close these issues.

Memory Explosion in Continuous Learning v2 Observer

1 participant