Add slopwatch integration for reward hacking detection #8

Merged: Aaronontheweb merged 2 commits into dev from feature/add-slopwatch-integration on Feb 21, 2026

@Aaronontheweb (Collaborator)

Summary

  • Add slopwatch.cmd v0.3.3 as a local dotnet tool
  • Initialize baseline with 1 existing entry (CS1591 suppression in Directory.Build.props)
  • Add dotnet slopwatch analyze step to PR validation CI workflow (runs after tests, before pack)
  • Add slopwatch to AGENTS.md quality bar, definition of done, and post-code quality check section

Test plan

  • dotnet slopwatch analyze passes locally (0 new issues)
  • CI workflow runs slopwatch step successfully on PR

- Add slopwatch.cmd v0.3.3 as local dotnet tool
- Initialize baseline (1 entry: CS1591 suppression in Directory.Build.props)
- Add slopwatch analyze step to PR validation CI workflow
- Add slopwatch to quality bar and definition of done in AGENTS.md
Runs once on ubuntu-latest instead of inside the test matrix on both OS.
@Aaronontheweb Aaronontheweb merged commit a329184 into dev Feb 21, 2026
3 checks passed
@Aaronontheweb Aaronontheweb deleted the feature/add-slopwatch-integration branch February 21, 2026 19:44
Aaronontheweb added a commit to Aaronontheweb/netclaw that referenced this pull request Jun 3, 2026
… check

- doctor: inject DaemonConfig for the channel instead of hand-reading
  netclaw.json, so a non-string Daemon.UpdateChannel no longer crashes the
  whole `netclaw doctor` run (findings #1, #4).
- update-check: drop the low-value 1h TTL and keep a plain last-result store
  for the daemon /status API (finding #3). Make `channel` required (no default)
  on EvaluateManifest + CheckForUpdateAsync so a forgotten arg can't silently
  fall back to stable (finding #9).
- SemVer: parse numeric prerelease identifiers as long, matching the bash
  generator's unbounded ints (finding #6).
- /status: report FullVersion in the no-check-yet branch for consistency with
  the post-check branch (finding #8).
- dotted prerelease convention (beta.N): update docs/examples + widen the
  property-test generators + add example cases; the release version gate now
  rejects mixed identifiers like `beta1` so a non-dotted tag can't ship and
  silently mis-order the channel (finding #2).
- conformance: extract the generator's precedence key to
  feeds/scripts/semver_key.py and assert BOTH it and the C# SemVer comparator
  order one shared fixture (feeds/scripts/semver-order.txt), via a C# test and
  a CI check, so the two implementations can't drift (finding #7).

Also updates the release-channels OpenSpec change (dotted-tag requirement plus
risk/decision notes).
Aaronontheweb added a commit that referenced this pull request Jun 3, 2026
* feat(update): channel-aware, semver-correct update check (#1027)

Make the binary update check honor an opt-in beta channel and compare
versions by SemVer 2.0.0 precedence, so beta testers are notified of the
next prerelease while stable users are never offered one.

- BinaryFeedManifest: add `latestPrerelease` (newest of {stable, prerelease}).
- SemVer: self-contained 2.0.0 precedence comparator (no NuGet.Versioning),
  matching the bash manifest generator's rules; IsNewerVersion uses it instead
  of System.Version (which couldn't parse a prerelease suffix at all).
- BuildInfo.FullVersion: read the assembly informational version (keeps
  `-beta1`) so a beta build doesn't report its stripped core and strand.
- DaemonConfig.UpdateChannel (stable default | beta) + config schema; parse
  fails loudly on an unknown value.
- EvaluateManifest/CheckForUpdateAsync are channel-aware: stable reads only
  `latest`; beta reads `latestPrerelease` and rolls onto a superseding stable.
- Thread channel + FullVersion through the daemon check, `netclaw update`,
  the startup notice, `netclaw status`, and `netclaw doctor`.

Stable clients structurally never read `latestPrerelease`. The check stays
advisory-only; Daemon.DisableSelfUpdate still blocks in-place update.
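
The channel policy above can be sketched in a few lines. This is a minimal Python illustration, not the C# implementation from the commit: the function names `semver_key` and `evaluate_manifest` and the dict-shaped manifest are assumptions made for the example, and the parser is simplified (it does not reject leading zeros or empty identifiers).

```python
import re

# Simplified SemVer 2.0.0 shape: core, optional prerelease, optional build metadata.
_SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$"
)


def semver_key(version):
    """Sort key following SemVer 2.0.0 precedence; build metadata is ignored."""
    m = _SEMVER.match(version)
    if m is None:
        raise ValueError(f"not a SemVer: {version!r}")
    core = tuple(int(p) for p in m.group(1, 2, 3))
    pre = m.group(4)
    if pre is None:
        # A release outranks every prerelease of the same core version.
        return (core, 1, ())
    # Numeric identifiers compare as integers and rank below alphanumeric ones.
    idents = tuple(
        (0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre.split(".")
    )
    return (core, 0, idents)


def evaluate_manifest(manifest, current, channel):
    """Return the version to offer, or None. `channel` deliberately has no default."""
    if channel == "stable":
        candidate = manifest.get("latest")  # stable never reads latestPrerelease
    elif channel == "beta":
        candidate = manifest.get("latestPrerelease")
    else:
        raise ValueError(f"unknown update channel: {channel!r}")
    if candidate is not None and semver_key(candidate) > semver_key(current):
        return candidate
    return None
```

Because `latestPrerelease` is described as the newest of {stable, prerelease}, a beta client in this sketch moves onto a superseding stable release without a separate code path.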

Tests: SemVer precedence, channel-aware evaluation, ParseUpdateChannel.

* docs(openspec): add release-channels capability spec

Capture the beta (prerelease) release-channel capability end-to-end:
manifest pointer semantics, prerelease-aware publishing, installer/Docker
channel selection, and the channel-aware update-check policy. Documents
both PR #1314 (merged) and the update-check work in this PR.

Key invariant specified: a stable client is never offered a prerelease.

* test(semver): add CsCheck property-based tests for SemVer

Generate thousands of random valid SemVers and assert the comparator's
correctness laws: parse-totality, antisymmetry, transitivity, IsNewer/compare
consistency, build-metadata invariance, stable-outranks-prerelease, and
numeric-below-alphanumeric precedence. Complements the fixed-example
SemVerTests with algebraic coverage over a large random space.
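
For illustration, the same laws can be checked with nothing but the standard library. The sketch below is Python using `random`, not CsCheck, and every name in it (`semver_key`, `random_semver`, `check_laws`) is hypothetical; it only shows the shape of the properties listed above.

```python
import random


def semver_key(version):
    """Compact SemVer 2.0.0 precedence key (simplified parser, no validation)."""
    version = version.split("+", 1)[0]  # build metadata never affects precedence
    core, _, pre = version.partition("-")
    nums = tuple(int(p) for p in core.split("."))
    if not pre:
        return (nums, 1, ())
    idents = tuple(
        (0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre.split(".")
    )
    return (nums, 0, idents)


def compare(a, b):
    ka, kb = semver_key(a), semver_key(b)
    return (ka > kb) - (ka < kb)


def random_semver(rng):
    """Generate a valid version, with a dotted prerelease about half the time."""
    core = ".".join(str(rng.randint(0, 20)) for _ in range(3))
    if rng.random() < 0.5:
        return core
    idents = [
        str(rng.randint(0, 50)) if rng.random() < 0.5
        else rng.choice(["alpha", "beta", "rc"])
        for _ in range(rng.randint(1, 3))
    ]
    return core + "-" + ".".join(idents)


def check_laws(samples=2000, seed=0):
    rng = random.Random(seed)
    for _ in range(samples):
        a, b, c = (random_semver(rng) for _ in range(3))
        assert compare(a, b) == -compare(b, a)            # antisymmetry
        if compare(a, b) <= 0 and compare(b, c) <= 0:
            assert compare(a, c) <= 0                     # transitivity
        assert compare(a + "+build.1", a) == 0            # build-metadata invariance
        if "-" in a:
            assert compare(a.split("-")[0], a) > 0        # stable outranks prerelease
    return samples
```

A fixed seed keeps a failure reproducible; a dedicated property-testing library would additionally shrink failing inputs.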

Adds CsCheck 4.7.0 (test-only) via central package management.

* fix(update): address code-review findings on the channel-aware update check

- doctor: inject DaemonConfig for the channel instead of hand-reading
  netclaw.json, so a non-string Daemon.UpdateChannel no longer crashes the
  whole `netclaw doctor` run (findings #1, #4).
- update-check: drop the low-value 1h TTL and keep a plain last-result store
  for the daemon /status API (finding #3). Make `channel` required (no default)
  on EvaluateManifest + CheckForUpdateAsync so a forgotten arg can't silently
  fall back to stable (finding #9).
- SemVer: parse numeric prerelease identifiers as long, matching the bash
  generator's unbounded ints (finding #6).
- /status: report FullVersion in the no-check-yet branch for consistency with
  the post-check branch (finding #8).
- dotted prerelease convention (beta.N): update docs/examples + widen the
  property-test generators + add example cases; the release version gate now
  rejects mixed identifiers like `beta1` so a non-dotted tag can't ship and
  silently mis-order the channel (finding #2).
- conformance: extract the generator's precedence key to
  feeds/scripts/semver_key.py and assert BOTH it and the C# SemVer comparator
  order one shared fixture (feeds/scripts/semver-order.txt) — via a C# test and
  a CI check — so the two implementations can't drift (finding #7).

Also updates the release-channels OpenSpec change (dotted-tag requirement plus
risk/decision notes).
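
The mis-ordering that the dotted convention guards against is easy to reproduce. A minimal Python sketch follows; `prerelease_key` and `gate_accepts` are hypothetical names, and the exact rule the real release gate applies is an assumption here (the sketch rejects any prerelease identifier that mixes letters and digits).

```python
def prerelease_key(pre):
    """SemVer 2.0.0 precedence key for a dot-separated prerelease string."""
    return tuple(
        (0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre.split(".")
    )


def is_mixed_identifier(ident):
    """True for identifiers such as 'beta1' that mix letters and digits."""
    return any(ch.isdigit() for ch in ident) and any(ch.isalpha() for ch in ident)


def gate_accepts(tag):
    """Assumed gate rule: accept a tag only if no prerelease identifier is mixed."""
    _, _, pre = tag.partition("-")
    if not pre:
        return True  # plain stable tag
    return not any(is_mixed_identifier(p) for p in pre.split("."))


# 'beta10' and 'beta2' are single alphanumeric identifiers, so they compare as
# ASCII strings and beta10 sorts BELOW beta2.
non_dotted = sorted(["beta2", "beta10"], key=prerelease_key)

# With the dotted form the trailing number is its own numeric identifier.
dotted = sorted(["beta.10", "beta.2"], key=prerelease_key)
```

Here `non_dotted` comes out as `["beta10", "beta2"]` while `dotted` comes out as `["beta.2", "beta.10"]`, which is why a non-dotted tag would silently mis-order the channel.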
Aaronontheweb added a commit to Aaronontheweb/netclaw that referenced this pull request Jun 15, 2026
Resolves the 11 findings from the /code-review pass:

#1 Multi-line secret redaction: per-line redaction in JobOutputLog misses
   secrets spanning lines (e.g. PEM blocks). Re-redact the assembled tail at
   every LLM-surface point (execution-actor completion, manager HandleQuery,
   NotifyLostJob) so multi-line secrets can't reach the model.
#2 Journaled reap event (SessionBackgroundJobsReaped): reap marks were
   snapshot-only and lost on recovery when the passivation snapshot is skipped
   (parked approval), rehydrating killed jobs as 'running'. FinishJobReap now
   persists the reap; recovery replays it. Full serializer plumbing + round-trip test.
#3 Dispose the Process in BackgroundJobExecutionActor.PostStop, which stops the
   kernel handle / wait-handle leak (amplified by the no-default-timeout).
#4 Audience-gate the [active-background-jobs] block (commands, rationales, and
   the output-log path) for Public, matching WorkingContext.
#5 JobOutputLog.ReadTail falls back to the rotated .1 file when the current log
   is momentarily absent mid-rotation, instead of returning an empty tail.
#6 A transient File.Move failure in Rotate() is non-fatal: capture continues on
   the current log and retries next threshold, rather than permanently going silent.
#7 Back WriteFailure with a volatile field (un-gated fast-path read crosses threads).
#8 Correlate reap Ask replies with an epoch so a late reply from a superseded
   passivation can't resolve a newer handshake.
#10 Centralize the reap-reply handler (CommandJobReapResolved) across all
    non-terminal phases so a future phase can't silently drop the reply.
#11 Apply(TurnRecorded) now delegates job dedup/prune to the single shared
    CompleteTurnBackgroundJobBookkeeping helper so replay and live paths can't drift.
#9 AutoFlush is kept (live monitoring requires per-line visibility; a write() to
   the page cache is cheap and a time-throttle risks an unflushed quiescent
   ready-line); documented as a deliberate decision.

Tests: +6 (reaped-event round-trip, ReadTail rotation fallback + rethrow,
SessionBackgroundJobsReaped apply, Public/Personal active-jobs gating); updated
RotationFailure test to the new non-fatal contract. Full Actors suite 2412 green
x2; slopwatch + headers clean.
Aaronontheweb added a commit that referenced this pull request Jun 15, 2026
… kill timer, reap on passivation (#1405)

* Background jobs as detached processes: stream logs live, no default kill timer, reap on passivation, Lost notifications

A background job is now a detached process with no expectation of completion
(OpenSpec: background-jobs-detached-process-redesign). Fixes the hung-session
class where a dev server (jekyll serve / npm run dev) could never be used:
both execution paths blocked on process exit.

- Stream stdout/stderr to ~/.netclaw/jobs/{id}/output.log while the process
  runs (per-line secret redaction, 5MB single-slot rotation). The existing
  check_background_job tail query and file_read/grep monitoring now work
  mid-run; output survives daemon crashes. Completion tails read from disk.
- Remove the silent default kill timer on background routing: omitted
  _timeout_seconds now means no timer (was: synchronous default, killing
  un-hinted jobs early). Submit ACK includes the output log path.
- Reap on session passivation: KillJobsForSession handshake before the final
  snapshot; new Reaped status (distinct from Cancelled); no turn delivery on
  reap (would rehydrate the session being torn down); reaped entries surface
  exactly once in [active-background-jobs] on rehydration, then prune.
- Wire up session-side job tracking (TrackBackgroundJob had no production
  caller — the active-jobs context block was always empty).
- Daemon-restart reconciliation now delivers Lost notifications to owning
  sessions with the pre-crash log path.
- Remove the vestigial pending-approval passivation deferral: approvals are
  journaled and the response path already rehydrates and resumes.
- AGENTS.md template, netclaw-operations SKILL.md (v2.13.0), and the
  background-jobs runbook document the new lifecycle; eval suite gains a
  background-job lifecycle regression case.
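
As a rough illustration of the streaming design described in the list above, here is a minimal Python sketch of a per-line log writer with one rotated slot, fed by a pump thread. It is not the C# JobOutputLog: the class shape, the `redact` hook signature, and `start_job` are assumptions for the example, and only the 5MB threshold and the `.1` rotation slot come from the text.

```python
import os
import subprocess
import threading

ROTATE_BYTES = 5 * 1024 * 1024  # 5MB threshold, as described above


class JobOutputLog:
    """Append-only job log with a single rotated slot (<path>.1)."""

    def __init__(self, path, rotate_bytes=ROTATE_BYTES, redact=lambda s: s):
        self.path = path
        self.rotate_bytes = rotate_bytes
        self.redact = redact  # per-line redaction hook (hypothetical signature)
        self._lock = threading.Lock()
        self._fh = open(path, "ab")
        self._size = os.path.getsize(path)

    def write_line(self, line):
        data = (self.redact(line.rstrip("\r\n")) + "\n").encode("utf-8")
        with self._lock:
            self._fh.write(data)
            self._fh.flush()  # each line becomes visible to readers immediately
            self._size += len(data)
            if self._size >= self.rotate_bytes:
                self._rotate()

    def _rotate(self):
        self._fh.close()
        os.replace(self.path, self.path + ".1")  # overwrites the previous .1
        self._fh = open(self.path, "ab")
        self._size = 0

    def close(self):
        with self._lock:
            self._fh.close()


def start_job(argv, log):
    """Spawn argv and pump its merged stdout/stderr into the log on a thread."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )

    def pump():
        for line in proc.stdout:
            log.write_line(line)

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return proc, thread
```

The caller never waits on the process in this sketch, so a long-running server can be monitored by reading the log file while it is still running.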

* Fix background-job lifecycle eval: multi-turn harness, pre-trusted verb, tightened assertion

The new tool_background_job_lifecycle case scored 0/5 for instrumentation
reasons, not model behavior (per the eval-debugging guidance):

1. run_case treats multiple prompts as alternate phrasings (pick_variant)
   — sequential conversations need run_multi_turn_case, which resumes one
   session and accumulates stdout across turns for the assertion.
2. Even then, every background submission died at the approval gate: the
   headless eval container has no approval requester and 'sleep' is not on
   the safe-command allowlist. Passing runs were vacuous (the model probed
   check_background_job with a made-up ID while flailing). The eval setup
   now pre-trusts the sleep verb via 'netclaw approvals trust-verb' against
   the bind-mounted tool-approvals.json before the container starts, so the
   case exercises the real lifecycle: submit -> job id -> status -> cancel.
3. The assertion now requires the actual "_background":true submission,
   not just any shell_execute call.

Result: 5/5, with transcripts showing the genuine flow (job id returned,
ACK steering to the streaming log path, live status with elapsed time,
cancel confirmed).

* Fix CI: SW003 empty-catch marker, parallel-test isolation for real-process job tests

Two PR CI failures:

1. Slopwatch SW003 — the write-failure path in JobOutputLog had an empty
   inner catch with the rationale as a body comment instead of the repo's
   'catch // slopwatch-ignore: SW003 <reason>' marker convention. (Passed
   locally because slopwatch 0.4.1 only scans the git diff vs local HEAD;
   CI's PR-merge scans the whole new file.)

2. Test-ubuntu-latest flake — KillJobsForSession_ReapsOwnedJobs and
   BackgroundJob_Completes_And_DeliversResult_ViaGateway intermittently
   failed with the owning manager's freshly-created jobs showing 'Lost'.
   Root cause (reproduced reliably by running the Jobs test classes
   together): under heavy parallel load, concurrent process/FS pressure
   makes a manager's message handler throw transiently, the actor restarts,
   and startup reconciliation correctly marks its in-flight jobs Lost. The
   restart itself is spurious, an artifact of test load. Fix: serialize the three
   real-process-spawning job test classes via a DisableParallelization
   collection (repo's established pattern) so they don't mutually starve.
   Verified: full assembly 4/4 green, the prior ~Jobs repro 3/3 green.

Also register TimeProvider in LlmSessionTestBase to mirror production DI
(Daemon Program.cs) — WithNetclawActors() constructs the background-job and
reminder managers via the DI resolver, which need it; without it they died
with ActorInitializationException at startup, adding restart churn.

* Address code-review findings on background-jobs feature

Resolves the 11 findings from the /code-review pass:

#1 Multi-line secret redaction: per-line redaction in JobOutputLog misses
   secrets spanning lines (e.g. PEM blocks). Re-redact the assembled tail at
   every LLM-surface point (execution-actor completion, manager HandleQuery,
   NotifyLostJob) so multi-line secrets can't reach the model.
#2 Journaled reap event (SessionBackgroundJobsReaped): reap marks were
   snapshot-only and lost on recovery when the passivation snapshot is skipped
   (parked approval), rehydrating killed jobs as 'running'. FinishJobReap now
   persists the reap; recovery replays it. Full serializer plumbing + round-trip test.
#3 Dispose the Process in BackgroundJobExecutionActor.PostStop — stops the
   kernel handle / wait-handle leak (amplified by the no-default-timeout).
#4 Audience-gate the [active-background-jobs] block (commands, rationales, and
   the output-log path) for Public, matching WorkingContext.
#5 JobOutputLog.ReadTail falls back to the rotated .1 file when the current log
   is momentarily absent mid-rotation, instead of returning an empty tail.
#6 A transient File.Move failure in Rotate() is non-fatal: capture continues on
   the current log and retries next threshold, rather than permanently going silent.
#7 Back WriteFailure with a volatile field (un-gated fast-path read crosses threads).
#8 Correlate reap Ask replies with an epoch so a late reply from a superseded
   passivation can't resolve a newer handshake.
#10 Centralize the reap-reply handler (CommandJobReapResolved) across all
    non-terminal phases so a future phase can't silently drop the reply.
#11 Apply(TurnRecorded) now delegates job dedup/prune to the single shared
    CompleteTurnBackgroundJobBookkeeping helper so replay and live paths can't drift.
#9 AutoFlush is kept (live monitoring requires per-line visibility; a write() to
   the page cache is cheap and a time-throttle risks an unflushed quiescent
   ready-line) — documented as a deliberate decision.
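
Finding #1 is easy to demonstrate: a pattern that needs both the BEGIN and END markers can never match when it is applied one line at a time. The Python sketch below uses two illustrative patterns (the `sk-` token format is hypothetical, and neither regex is taken from the project) and applies the same `redact` function at both granularities.

```python
import re

# Hypothetical single-line token format, for illustration only.
_TOKEN = re.compile(r"\bsk-[A-Za-z0-9]{8,}\b")
# PEM blocks span several lines, so the pattern needs DOTALL and the full text.
_PEM = re.compile(r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----", re.DOTALL)


def redact(text):
    """Apply every pattern to whatever text is given (one line or a whole tail)."""
    return _PEM.sub("[REDACTED]", _TOKEN.sub("[REDACTED]", text))


lines = [
    "token=sk-abcdef123456",
    "-----BEGIN PRIVATE KEY-----",
    "MIIEvQIBADANBg",
    "-----END PRIVATE KEY-----",
    "done",
]

# Per-line pass: the token is caught, but the key body survives because no
# single line contains both PEM markers.
per_line = "\n".join(redact(line) for line in lines)

# Second pass over the assembled tail, just before it is shown to the model.
assembled = redact(per_line)
```

After the per-line pass the base64 body is still present in `per_line`; after the second pass `assembled` contains only the two `[REDACTED]` markers and `done`.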

Tests: +6 (reaped-event round-trip, ReadTail rotation fallback + rethrow,
SessionBackgroundJobsReaped apply, Public/Personal active-jobs gating); updated
RotationFailure test to the new non-fatal contract. Full Actors suite 2412 green
x2; slopwatch + headers clean.
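
The rotation fallback from finding #5 can be sketched as follows. This is a minimal Python illustration with a hypothetical `read_tail` function; returning an empty string when neither file exists is an assumption, and any error other than a missing file is left to propagate.

```python
import os


def read_tail(path, max_bytes=4096):
    """Return up to max_bytes from the end of the log, trying <path>.1 as a fallback."""
    for candidate in (path, path + ".1"):
        try:
            with open(candidate, "rb") as fh:
                fh.seek(0, os.SEEK_END)
                size = fh.tell()
                fh.seek(max(0, size - max_bytes))
                return fh.read().decode("utf-8", errors="replace")
        except FileNotFoundError:
            # The current log can be briefly absent mid-rotation; try the next file.
            continue
    return ""  # assumption: nothing captured yet
```

Only `FileNotFoundError` triggers the fallback, so a permission or I/O error still surfaces to the caller instead of being reported as an empty tail.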

* Fix racy ReminderManagerActorTests.Startup_emits_alert_for_legacy_reminder_missing_trust_fields

Root cause (per akka-net + dotnet-concurrency analysis): the legacy-schema
alert is emitted synchronously inside the actor's PreStart, and the test waited
for it with a fixed 5s AwaitAssertAsync poll. Under heavy parallel CI load the
shared ThreadPool is saturated (many TestKit ActorSystems, WithSerializationVerification
overhead), so the actor's PreStart can be scheduled later than the 5s budget and
the poll gives up with an empty sink. Not a logic/visibility bug — the sink is
lock-guarded and the store records the rejection synchronously in its constructor.

Fix: await a deterministic readiness signal instead of polling a wall clock. An
actor processes mailbox messages only after PreStart completes, so a successful
Ask<ReminderHealthResponse>(GetReminderHealthQuery) reply guarantees the emit has
run. This is the same readiness pattern already used elsewhere in this test file;
the generous Ask timeout absorbs scheduling latency and returns as soon as the
actor is ready (no wasted time in the common case).
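
The readiness pattern can be illustrated outside Akka.NET. The sketch below is a toy asyncio actor in Python, with hypothetical names throughout (`ToyActor`, `ask`, the `"health"` message); it mirrors only the one guarantee the fix relies on, namely that the mailbox is not drained until the startup hook has finished.

```python
import asyncio


class ToyActor:
    """Toy actor: the startup hook runs to completion before any message is handled."""

    def __init__(self, sink):
        self.sink = sink
        self.mailbox = asyncio.Queue()

    async def pre_start(self):
        await asyncio.sleep(0.05)  # stands in for unpredictable scheduling delay
        self.sink.append("legacy-schema alert")

    async def run(self):
        await self.pre_start()
        while True:
            msg, reply = await self.mailbox.get()
            if msg == "stop":
                reply.set_result(None)
                return
            if msg == "health":
                reply.set_result("healthy")

    async def ask(self, msg, timeout):
        reply = asyncio.get_running_loop().create_future()
        await self.mailbox.put((msg, reply))
        return await asyncio.wait_for(reply, timeout)


async def demo():
    sink = []
    actor = ToyActor(sink)
    task = asyncio.create_task(actor.run())
    # A reply proves pre_start has finished, so the sink can be read without
    # polling. The generous timeout only matters if the actor never starts.
    await actor.ask("health", timeout=30)
    observed = list(sink)
    await actor.ask("stop", timeout=30)
    await task
    return observed
```

The wait ends as soon as the reply arrives, so the large timeout costs nothing in the common case and only bounds the failure case.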

No existing GitHub issue covers this test. Does not reproduce locally even at
full-assembly parallelism (CI-runner-only starvation).