Add /goal session continuation command #85723

Closed
PollyBot13 wants to merge 20 commits into openclaw:main from PollyBot13:goal-command-minimal

Conversation

@PollyBot13

@PollyBot13 PollyBot13 commented May 23, 2026

Summary

  • add a bundled goal extension with /goal help, /goal start, /goal status, /goal events, /goal pause, /goal resume, /goal done, and /goal clear
  • add a goal_status tool so the active session can report continue, done, blocked, paused, or waiting_approval
  • store one visible goal state per trusted session
  • add and use the continuation lease workflow API to enqueue one same-session follow-up turn
  • add stop gates, terminal states, and a continuation cap

Scope

This PR now carries both the session-scoped continuation lease workflow API and the bundled /goal user flow that exercises it. The earlier split-out scaffold PR (#85722) was closed, so this branch targets main directly rather than depending on a stacked prerequisite.

AI assistance and review transparency

  • AI-assisted: yes; this PR was developed with PollyBot/Codex assistance under human direction and local verification.
  • Human understanding: I understand the goal plugin state machine, continuation lease contract, trusted tool-context binding, stop gates, and current proof limitations well enough to explain and maintain the change.
  • Prompts/session logs: the useful review context is captured in this PR body and PR comments rather than pasted raw, because the working session contains private local/channel identifiers that are not appropriate for the public PR.
  • Local Codex review: attempted locally, but the Codex CLI currently fails with OpenAI 401 Unauthorized; I replaced that with manual diff/log review and focused local validation instead of claiming a Codex review pass.
  • Bot conversations: author-owned ClawSweeper findings addressed so far were fixed or folded into the current review state; maintainer-owned SDK/product/proof decisions remain explicit below.

Why

The continuation lease API is a runtime primitive, which makes it hard to judge without a real user flow. /goal is that user flow: a human starts an objective, inspects what happened, pauses or resumes continuation, and finishes or clears the goal, while the model can request only bounded same-session continuation through goal_status.

The user experience is not only "let the agent continue." It is also "let the human see whether the agent is behaving well enough to keep continuing." That matters across model tiers: a frontier model may stay on target most of the time, but cheaper or smaller models need tighter rails, clearer stop states, and a visible decision trail so users can trust, pause, resume, or stop work without guessing.

This is why the command surface includes both lifecycle controls and lightweight observability. /goal status shows the current state; /goal events shows the recent decision trail. The user does not need a dashboard or debug mode to answer the basic question: "is this still doing the right thing?"
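
The decision trail behind /goal events [n] could be kept as a small bounded log. The following is a minimal sketch, not the plugin's actual storage: GoalEventLog, its method names, and the retention limit of 50 are all hypothetical. The rendered format mirrors the "visible event log" line in the proof section of this PR body.

```typescript
// Hypothetical sketch: class name, method names, and retention limit are assumptions.
type GoalStatus = "continue" | "done" | "blocked" | "paused" | "waiting_approval";

interface GoalEvent {
  kind: string; // e.g. "created", "lease_scheduled", "status"
  status: GoalStatus; // goal status recorded with the event
}

class GoalEventLog {
  private events: GoalEvent[] = [];
  private maxEvents: number;

  constructor(maxEvents: number = 50) {
    this.maxEvents = maxEvents;
  }

  // Append an event and drop the oldest entries beyond the retention limit.
  record(kind: string, status: GoalStatus): void {
    this.events.push({ kind, status });
    if (this.events.length > this.maxEvents) {
      this.events.splice(0, this.events.length - this.maxEvents);
    }
  }

  // Return the most recent n events, oldest first.
  recent(n: number): GoalEvent[] {
    return n <= 0 ? [] : this.events.slice(-n);
  }

  // Render the trail as "kind status -> kind status".
  render(n: number): string {
    return this.recent(n)
      .map((e) => `${e.kind} ${e.status}`)
      .join(" -> ");
  }
}

const log = new GoalEventLog();
log.record("created", "continue");
log.record("lease_scheduled", "continue");
log.record("status", "done");
const trail = log.render(10);
```

A bounded log keeps the trail cheap to store per session while still answering "what happened recently?" without a debug mode.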

Safety

  • slash-command session context is host-owned
  • goal_status uses trusted tool context, not model-supplied sessionKey or goalId
  • model-driven continue is accepted only while the current goal is already in continue
  • done and blocked are terminal for /goal resume; start a new goal to continue
  • paused stops continuation until a human resumes
  • clear removes the active user-facing goal state and clears any matching continuation lease
  • continuation count is capped; hitting the cap moves the goal to waiting_approval, clears the lease, and tells the user to start a new goal
  • no cross-session targeting, no fanout, no silent wake modes, no sub-agent continuation
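
The stop gates above can be read as a small state machine. This is a sketch under stated assumptions rather than the plugin's code: the function names, the GoalState shape, and the cap value of 5 are hypothetical, and treating paused as the only resumable state is an assumption drawn from the bullets.

```typescript
// Hypothetical sketch of the stop gates; names and the cap value are assumptions.
type GoalStatus = "continue" | "done" | "blocked" | "paused" | "waiting_approval";

interface GoalState {
  status: GoalStatus;
  continuations: number; // continuation turns already used
}

interface ContinueDecision {
  state: GoalState;
  scheduled: boolean; // whether a follow-up turn would be enqueued
}

const ASSUMED_CAP = 5; // the PR body does not state the real cap

// Model-driven continue: accepted only while the goal is already in "continue".
function requestContinue(state: GoalState, cap: number = ASSUMED_CAP): ContinueDecision {
  if (state.status !== "continue") return { state, scheduled: false };
  if (state.continuations >= cap) {
    // Hitting the cap parks the goal and schedules nothing further.
    return { state: { ...state, status: "waiting_approval" }, scheduled: false };
  }
  return { state: { ...state, continuations: state.continuations + 1 }, scheduled: true };
}

// Human /goal resume: only a paused goal resumes; done and blocked stay terminal.
function resumeByHuman(state: GoalState): GoalState {
  return state.status === "paused" ? { ...state, status: "continue" } : state;
}
```

Keeping both transitions as pure functions makes each gate testable without a scheduler or a live session.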

Commands

/goal help
/goal start <objective>
/goal status
/goal events [n]
/goal pause
/goal resume
/goal done [note]
/goal clear [note]
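
For illustration, the command grammar above could be parsed along these lines. This is a hypothetical sketch: the real handler in extensions/goal/src/command.ts may differ, and the GoalCommand type, the default event count of 10, and the error reasons are assumptions.

```typescript
// Hypothetical parser sketch; type names, defaults, and error text are assumptions.
type GoalCommand =
  | { name: "help" | "status" | "pause" | "resume" }
  | { name: "start"; objective: string }
  | { name: "events"; count: number }
  | { name: "done" | "clear"; note: string | undefined }
  | { name: "invalid"; reason: string };

const DEFAULT_EVENT_COUNT = 10; // assumed default for "/goal events" without n

function parseGoalCommand(input: string): GoalCommand {
  // Group 1 is the subcommand, group 2 is everything after it.
  const match = /^\/goal(?:\s+(\S+))?(?:\s+([\s\S]*))?$/.exec(input.trim());
  if (match === null) return { name: "invalid", reason: "not a /goal command" };
  const sub = match[1] ?? "help";
  const rest = (match[2] ?? "").trim();
  const note = rest === "" ? undefined : rest;
  switch (sub) {
    case "help":
      return { name: "help" };
    case "status":
      return { name: "status" };
    case "pause":
      return { name: "pause" };
    case "resume":
      return { name: "resume" };
    case "start":
      return rest === ""
        ? { name: "invalid", reason: "objective required" }
        : { name: "start", objective: rest };
    case "events":
      if (rest === "") return { name: "events", count: DEFAULT_EVENT_COUNT };
      if (!/^[1-9]\d*$/.test(rest)) {
        return { name: "invalid", reason: "count must be a positive integer" };
      }
      return { name: "events", count: Number.parseInt(rest, 10) };
    case "done":
      return { name: "done", note };
    case "clear":
      return { name: "clear", note };
    default:
      return { name: "invalid", reason: `unknown subcommand: ${sub}` };
  }
}
```

Returning a tagged union instead of throwing lets the caller render a help message for invalid input without a try/catch.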

Tests

pnpm exec vitest run extensions/goal/index.test.ts
pnpm test:extensions:batch -- goal
pnpm tsgo:extensions:test
pnpm exec oxlint extensions/goal/index.ts extensions/goal/index.test.ts \
  extensions/goal/src/command.ts extensions/goal/src/state.ts \
  extensions/goal/src/tool.ts extensions/goal/src/workflow.ts
pnpm build

Plugin guardrails also passed:

pnpm exec vitest run --config test/vitest/vitest.contracts-plugin.config.ts \
  src/plugins/contracts/package-manifest.contract.test.ts \
  src/plugins/contracts/plugin-entry-guardrails.test.ts \
  src/plugins/contracts/plugin-tool-contracts.test.ts

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: Real bounded same-session goal continuation for the bundled /goal command: start a goal, schedule a continuation lease, resume the same session, visibly announce continuation output, finish as done, and clear active goal state.

  • Real environment tested: Isolated local canary Gateway profile using the PR checkout, bundled codex, discord, and goal plugins, separate canary Discord bot, loopback Gateway, and a private Discord proof channel. Private channel/session IDs are redacted from this PR body.

  • Exact steps or command run after this patch: In the canary Discord proof channel, run /goal start <objective>, allow the scheduled continuation lease to fire, let the resumed session complete, then inspect the canary goal proof artifact, canary cron run records, active goal session directory, and visible canary bot messages in the proof channel.

  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): Copied redacted runtime evidence from the isolated canary profile and private Discord proof channel on pushed head 41ecd3a38da1d64343110cde8926b0dd46009929:

    canary profile: ~/.openclaw-canary
    canary Gateway: 127.0.0.1:18799 loopback-only isolated Gateway
    canary bot: separate Discord canary bot
    PR head: 51ad5069b00da5e4a560b2d03f89b7ce01f86b1b
    visible Discord result: goal completed
    visible continuation count: 1
    visible event log: created continue -> lease_scheduled continue -> status done
    scheduler path: Cron-backed same-session continuation lease
    delivery target: originating private Discord proof channel
    proof media: still snapshot retained for review; full screen recording available on request
    
  • Observed result after fix: The canary run created a real session goal, scheduled a real continuation lease, resumed the same Discord session, displayed the continuation count and event log, reached final status done, and preserved visible proof of the scheduler-backed continuation path.

  • What was not tested: Public Discord identifiers and the raw screen recording are not embedded in this PR body because the proof channel is private. The full recording is available on request.

  • Proof limitations or environment constraints: The Mantis Telegram proof still did not exercise product behavior because Crabbox could not start a Telegram Desktop lease, so baseline/candidate Telegram capture was skipped. The Discord canary proof above covers the /goal continuation behavior; later current-head commits only addressed unrelated CI/harness/lint drift and did not change the /goal runtime path.

Current validation

Local/current-head validation for the source fixes included:

node scripts/run-vitest.mjs src/plugins/contracts/scheduled-turns.contract.test.ts src/plugins/loader.test.ts extensions/goal/index.test.ts
pnpm check:test-types
pnpm tsgo:prod
pnpm tool-display:check
pnpm run lint:extensions:bundled
pnpm run lint:extensions:channels
pnpm run test:extensions:package-boundary:compile
pnpm run test:extensions:package-boundary:canary
NPM_CONFIG_BEFORE= NPM_CONFIG_MIN_RELEASE_AGE=0 pnpm deps:shrinkwrap:check
node scripts/build-all.mjs
git diff --check

GitHub current-head readback on pushed head b61b6d0499c46c4e3c5f81c238d67a29733fb6f9 is pending after rebasing onto current main 8f6a2f0f6b119e8eb3e63d53800207fabe78e735. Local targeted validation passed:

node scripts/run-vitest.mjs \
  src/cron/isolated-agent/run.message-tool-policy.test.ts \
  src/agents/embedded-agent-runner/run/attempt.test.ts \
  src/agents/embedded-agent-runner/run/attempt-tool-construction-plan.test.ts \
  src/plugins/contracts/scheduled-turns.contract.test.ts \
  src/plugins/loader.test.ts \
  extensions/goal/index.test.ts \
  src/plugins/channel-plugin-ids.test.ts \
  test/scripts/openclaw-e2e-instance.test.ts
pnpm plugin-sdk:api:check
git diff --check

@PollyBot13 PollyBot13 requested a review from a team as a code owner May 23, 2026 13:47
@github-actions github-actions Bot added the dependencies-changed PR changes dependency-related files label May 23, 2026
@github-actions

github-actions Bot commented May 23, 2026

Contributor

Dependency Changes Detected

This PR changes dependency-related files. Maintainers should confirm these changes are intentional.

Changed files:

  • extensions/goal/package.json
  • pnpm-lock.yaml

Maintainer follow-up:

  • Review whether the dependency changes are intentional.
  • Inspect resolved package deltas when lockfile, shrinkwrap, or workspace dependency policy changes are present.
  • Treat package-lock.json and npm-shrinkwrap.json diffs as security-review surfaces.
  • Run pnpm deps:changes:report -- --base-ref origin/main --markdown /tmp/dependency-changes.md --json /tmp/dependency-changes.json locally for detailed release-style evidence.

@openclaw-barnacle openclaw-barnacle Bot added size: XL triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 23, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 23, 2026
@PollyBot13 PollyBot13 force-pushed the goal-command-minimal branch from 1e24225 to b6e47e8 Compare May 23, 2026 13:53
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed May 27, 2026, 2:16 PM ET / 18:16 UTC.

Summary
The PR adds a bundled goal plugin with /goal commands and a goal_status tool, extends Plugin SDK/session workflow APIs for continuation leases, wires Cron-backed same-session scheduling/delivery, and updates docs, tests, labeler, scripts, and lock metadata.

PR surface: Source +1036, Tests +1128, Docs +175, Config +38, Generated 0, Other +17. Total +2394 across 41 files.

Reproducibility: not applicable. This is a new user-facing feature and SDK capability, not a report of broken current-main behavior. Current-main search shows the requested /goal and continuation-lease behavior is not already present.

Review metrics: 3 noteworthy metrics.

  • Plugin SDK workflow surface: 2 methods added. New public plugin workflow methods are compatibility-sensitive and need maintainer acceptance before merge.
  • Dependency resolution: 1 plugin-local dependency declaration; 0 resolved package versions changed. The lockfile adds a new importer for existing typebox 1.1.38 rather than a new resolved dependency graph.
  • Bundled plugin activation surface: 1 bundled command plugin added, explicitly enabled by users. The command surface is user-visible but remains gated by plugin enablement and restrictive allowlist configuration.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
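
The "weaker of the two" rule can be written as a minimum over an ordered rank list. This is only an illustrative sketch: the ordering is taken from the rank legend in this review (weakest first, with the "off-meta tidepool" non-rating left out), and the function name is hypothetical rather than ClawSweeper's implementation.

```typescript
// Ranks ordered from weakest to strongest, following the rank legend in this review.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Hypothetical helper: overall readiness is the weaker of proof and patch quality.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}

const overall = overallRank("platinum hermit", "platinum hermit");
```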

Rank-up moves:

  • Confirm maintainer sign-off for the new Plugin SDK continuation-lease contract and bundled /goal product surface.
  • Retry Telegram or equivalent non-Discord live proof if maintainers want transport-specific confidence before merge.

Mantis proof suggestion
A short native Telegram proof would materially reduce cross-transport delivery uncertainty for the new visible goal command and continuation flow. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram desktop proof: verify /goal start schedules one same-session continuation, /goal status/events show the trail, and /goal pause stops further continuation.

Risk before merge

  • Merging creates a new public Plugin SDK workflow API; maintainers should explicitly accept the method names, result shapes, bundled-only behavior, and deprecation posture before the contract ships.
  • The goal state machine and continuation leases affect session state: cleanup failure, caps, pause/resume, and terminal states must be acceptable because a mistake can leave a session continuing or waiting unexpectedly.
  • Announcement delivery reconstructs Discord and non-Discord targets from session keys; source tests cover the path and Discord real proof is supplied, but the Telegram Desktop proof did not run because the earlier lease setup failed outside product code.

Maintainer options:

  1. Accept With Maintainer API Sign-Off (recommended)
    If maintainers want /goal bundled in core, approve the SDK continuation-lease contract and accept the documented session/message-delivery risk before merge.
  2. Request One More Transport Proof
    Before merge, ask for a Telegram or equivalent non-Discord live proof showing /goal start, continuation output, status/events, and pause/stop behavior in a real chat.
  3. Pause If This Should Stay Experimental
    If the continuation primitive is not ready to become a core Plugin SDK contract, pause or close this PR and continue the idea as an external/experimental plugin path.

Next step before merge
No narrow automated repair remains; the next action is maintainer review of the SDK/product decision and explicit acceptance or reduction of the merge risks.

Security
Cleared: The diff adds a plugin-local dependency declaration for an already-resolved package plus SDK/runtime/docs changes; I found no concrete secret, permission, lifecycle-script, package-resolution, or third-party execution regression.

Review details

Best possible solution:

Land only after maintainer sign-off on the bundled product surface and SDK contract, with the compatibility and transport risks either explicitly accepted or reduced by an additional live transport proof.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a new user-facing feature and SDK capability, not a report of broken current-main behavior. Current-main search shows the requested /goal and continuation-lease behavior is not already present.

Is this the best way to solve the issue?

Unclear pending maintainer judgment: the implementation is a coherent bundled plugin plus generic continuation lease API, but whether this is the right core product/API surface must be accepted by maintainers before merge.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against c0f16460d748.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied redacted live output from an isolated Discord canary run showing a real goal start, scheduled continuation lease, same-session resume, visible event log, final done, and cleanup on the PR branch.

Label justifications:

  • P2: This is a normal-priority feature/API PR with meaningful but bounded session and delivery impact.
  • merge-risk: 🚨 compatibility: The PR adds new Plugin SDK workflow methods and bundled plugin behavior that become compatibility-sensitive once shipped.
  • merge-risk: 🚨 session-state: The change persists one goal state per trusted session and schedules/clears continuation leases based on pause, done, blocked, and cap transitions.
  • merge-risk: 🚨 message-delivery: Continuation output is delivered back through channel targets reconstructed from session keys, including non-Discord channel forms.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • feature: ✨ showcase: ClawSweeper spotlight: unusually compelling feature idea for maintainer attention. A bounded, visible same-session continuation flow with human pause/status/event controls is a notably useful workflow unlock if maintainers want this capability in core.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied redacted live output from an isolated Discord canary run showing a real goal start, scheduled continuation lease, same-session resume, visible event log, final done, and cleanup on the PR branch.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied redacted live output from an isolated Discord canary run showing a real goal start, scheduled continuation lease, same-session resume, visible event log, final done, and cleanup on the PR branch.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR adds user-visible /goal chat behavior and continuation output that can be demonstrated in Telegram Desktop, and the scheduler path includes Telegram target parsing.

Evidence reviewed

PR surface:

Source +1036, Tests +1128, Docs +175, Config +38, Generated 0, Other +17. Total +2394 across 41 files.

Area       Files  Added  Removed  Net
Source        20   1046       10  +1036
Tests          8   1130        2  +1128
Docs           6    206       31  +175
Config         3     38        0  +38
Generated      1      2        2  0
Other          3     20        3  +17
Total         41   2442       48  +2394

What I checked:

  • Repository policy applied: Root policy says optional integrations normally live outside core, while missing core/plugin APIs and bundled regressions may stay; it also marks plugin APIs, session state, setup/startup, fallback, and delivery changes as compatibility-sensitive merge risk. (AGENTS.md:25, c0f16460d748)
  • Current main does not already implement the feature: A current-main search found only incidental generic “Goals” headings and no goal_status, continuation lease API, or /goal runtime implementation, so this PR is not obsolete on main. (c0f16460d748)
  • Bundled plugin entry: The PR adds a bundled goal plugin entry that registers the /goal command and goal_status tool. (extensions/goal/index.ts:5, 41ecd3a38da1)
  • Public SDK workflow surface: The PR adds requestSessionContinuationLease and clearSessionContinuationLease to the plugin session workflow API, which is a public plugin contract surface. (src/plugins/types.ts:2545, 41ecd3a38da1)
  • Scheduler-backed continuation implementation: The continuation lease implementation validates the trusted session, replaces existing lease jobs, and schedules a bounded one-shot Cron-backed session turn. (src/plugins/host-hook-scheduled-turns.ts:525, 41ecd3a38da1)
  • Delivery target parsing: Announced continuation delivery reconstructs channel targets from the session key, including Discord and non-Discord forms, which is why message-delivery review remains relevant. (src/plugins/host-hook-scheduled-turns.ts:85, 41ecd3a38da1)
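
To make the lease semantics above concrete (validate the trusted session, replace any existing lease, schedule one follow-up), here is a minimal in-memory model. It is a sketch, not the Cron-backed implementation in src/plugins/host-hook-scheduled-turns.ts: only the two method names and the { scheduled, reason } result fields come from this thread. The LeaseStore class, the parameters, the jobId field, and the "untrusted_session" reason are hypothetical.

```typescript
// Hypothetical in-memory model of a one-lease-per-session store.
// Parameters, jobId, and the "untrusted_session" reason are assumptions.
type LeaseResult =
  | { scheduled: true; jobId: string }
  | { scheduled: false; reason: string };

class LeaseStore {
  private leases = new Map<string, string>(); // sessionKey -> jobId
  private trusted: Set<string>;
  private nextId = 1;

  constructor(trustedSessionKeys: string[]) {
    this.trusted = new Set(trustedSessionKeys);
  }

  // Schedule at most one follow-up per session; a new request replaces the old lease.
  requestSessionContinuationLease(sessionKey: string): LeaseResult {
    if (!this.trusted.has(sessionKey)) {
      return { scheduled: false, reason: "untrusted_session" };
    }
    const jobId = `lease-${this.nextId++}`;
    this.leases.set(sessionKey, jobId);
    return { scheduled: true, jobId };
  }

  // Remove the lease for a session; returns whether one existed.
  clearSessionContinuationLease(sessionKey: string): boolean {
    return this.leases.delete(sessionKey);
  }

  activeLease(sessionKey: string): string | undefined {
    return this.leases.get(sessionKey);
  }
}

const store = new LeaseStore(["session-a"]);
const first = store.requestSessionContinuationLease("session-a");
const second = store.requestSessionContinuationLease("session-a");
```

Keying the map by session is what prevents fanout in this model: a second request overwrites the first lease instead of adding another job.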

Likely related people:

  • Vincent Koc: Current-main blame and log for the central plugin scheduler, registry, SDK type, and docs surfaces point to commit a2f714c in this area. (role: recent area contributor; confidence: medium; commits: a2f714cd440f; files: src/plugins/host-hook-scheduled-turns.ts, src/plugins/registry.ts, src/plugins/types.ts)
  • scoootscooob: Git history shows repeated merged work on plugin SDK, bundled plugin, and channel runtime boundaries that are adjacent to this PR's API and delivery changes. (role: adjacent owner; confidence: medium; commits: 94ef2f1b0d78, 01d3442246f3, d6367c2c55a2; files: src/plugins, src/plugin-sdk, extensions)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Cosmic Test Hopper

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: polishes edge cases.
Image traits: location release reef; accessory little merge flag; palette violet, aqua, and starlight; mood celebratory; pose peeking out from the egg shell; shell polished stone shell; lighting subtle sparkle highlights; background quiet workflow signs.

What is this egg doing here?
  • The egg appears after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@PollyBot13 PollyBot13 force-pushed the goal-command-minimal branch from b6e47e8 to 3abbf4a Compare May 23, 2026 14:50
@PollyBot13

Author

Addressed the two concrete ClawSweeper code findings in head 3abbf4a99965f610e99ef94c96fb794a3fc95bed:

  • goal_status.status now uses a flat string enum schema via Type.Unsafe({ type: "string", enum: [...] }), avoiding the anyOf shape from Type.Union(...Type.Literal...).
  • createTestPluginApi().session.workflow.requestSessionContinuationLease() now returns the declared continuation result shape: { scheduled: false, reason: "plugin_not_loaded" }.
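
The difference between the two schema shapes in the first bullet can be shown with plain JSON Schema objects. This sketch deliberately avoids importing TypeBox so it runs without dependencies. The anyOf shape is what a union of string literals is expected to emit, and flattenLiteralUnion is a hypothetical helper, not part of TypeBox or this PR.

```typescript
// Plain JSON Schema objects only; flattenLiteralUnion is a hypothetical helper.
type JsonSchema = { [key: string]: unknown };

const STATUSES = ["continue", "done", "blocked", "paused", "waiting_approval"];

// Shape expected from a union of string literals: one { const } branch per value.
const anyOfShape: JsonSchema = {
  anyOf: STATUSES.map((value) => ({ const: value, type: "string" })),
};

// Flat shape described in the fix: a single string schema with an enum list.
const flatShape: JsonSchema = { type: "string", enum: STATUSES };

// Collapse an anyOf of string constants into a flat enum; leave other schemas untouched.
function flattenLiteralUnion(schema: JsonSchema): JsonSchema {
  const branches = schema["anyOf"];
  if (!Array.isArray(branches)) return schema;
  const values: string[] = [];
  for (const branch of branches) {
    if (typeof branch !== "object" || branch === null) return schema;
    const value = (branch as JsonSchema)["const"];
    if (typeof value !== "string") return schema;
    values.push(value);
  }
  return { type: "string", enum: values };
}

const flattened = flattenLiteralUnion(anyOfShape);
```

Both shapes accept the same five strings; the flat form simply has no nested branches for a tool-schema consumer to walk.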

Focused verification after the patch:

pnpm exec vitest run extensions/goal/index.test.ts \
  src/plugins/contracts/scheduled-turns.contract.test.ts \
  src/plugins/contracts/plugin-tool-contracts.test.ts \
  src/plugins/contracts/plugin-entry-guardrails.test.ts \
  src/plugins/contracts/extension-package-project-boundaries.test.ts
# 5 files, 54 tests passed

pnpm tsgo:extensions:test
pnpm tsgo:core:test
pnpm exec oxlint extensions/goal/src/tool.ts src/plugin-sdk/plugin-test-api.ts
git diff --check

The broader plugin-sdk-package-contract-guardrails suite still reports the known unrelated meeting-notes package-subpath baseline issue; I did not touch that in this PR.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 23, 2026
@wmeerendonk

For the reviewer: this PR depends on #85722, which @clawsweeper closed as superseded by another PR. As far as I can tell, that other PR covers a completely different topic and bears no similarity to #85722. If this PR is interesting enough, please reopen #85722 or ask for it to be pulled into this PR.

@scoootscooob scoootscooob force-pushed the goal-command-minimal branch from 3abbf4a to d9650ac Compare May 23, 2026 23:34
@openclaw-barnacle openclaw-barnacle Bot added the docs Improvements or additions to documentation label May 23, 2026
@socket-security

socket-security Bot commented May 23, 2026

All alerts resolved. Learn more about Socket for GitHub.

This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored.


@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. and removed rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. labels May 23, 2026
@PollyBot13

Author

@clawsweeper re-review

Current head 911c91585c7dbc2c971aa2d5c6fa25e7327fec1f has the post-OpenGrep path-guard fix. Source-side rollup is clean, including Opengrep OSS success and the PR-diff OpenGrep job success. The only failed current-head check is checks-windows-node-test, which fails during runner setup before tests: requested Node 24.x, active Blacksmith Windows Node is 22.19.0 at /c/Program Files/nodejs/node. The later real-behavior proof run was cancelled, not source-failed.

Please re-review/update the durable PR state for this head; this looks infra-side rather than an author-code failure.

@clawsweeper

clawsweeper Bot commented May 26, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@PollyBot13

Author

@clawsweeper re-review

Current head 911c91585c7dbc2c971aa2d5c6fa25e7327fec1f now has the exact PR-body proof section ClawSweeper asked for:

  • section heading: ## Real behavior proof (required for external PRs)
  • required field labels are present
  • latest Real behavior proof check on this head passed: https://github.com/openclaw/openclaw/actions/runs/26467180908/job/77930529274
  • latest auto-response check on this head passed
  • source-side CI/security/doc/type/build rollup is clean; the only red current-head check is checks-windows-node-test, which fails during Node setup before tests (REQUESTED_NODE_VERSION=24.x, active Windows runner Node 22.19.0 at /c/Program Files/nodejs/node)

I am not asking for another Telegram/Mantis proof yet because the structural proof gate is now passing and the existing Mantis failure was infrastructure-side: Crabbox could not start a Telegram Desktop lease, so no product behavior was exercised. If exact-current-head transport proof is still required after this re-review, I can run a fresh canary proof and/or retry Telegram proof with the lease issue addressed.

@clawsweeper

clawsweeper Bot commented May 26, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@clawsweeper

clawsweeper Bot commented May 27, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@socket-security

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff   Package                              Supply Chain Security  Vulnerability  Quality  Maintenance  License
Added  npm/@a2ui/lit@0.10.0                 79                     100            100      94           100
Added  npm/@anthropic-ai/vertex-sdk@0.16.1  99                     100            100      96           100


@steipete

Contributor

Thanks for pushing this forward. I am going to close this PR and reimplement the feature from the core/thread-goal boundary instead of landing the current bundled-plugin shape.

The useful product idea is right, but after comparing it with the Codex goal implementation, this needs to be owned by session/runtime state rather than by a plugin-local JSON store and Cron-backed continuation lease. In Codex, goals are persisted thread state with runtime-owned accounting, hidden continuation context, app/server notifications, and model tools restricted to create/get/update. This PR instead makes /goal the owner of state and scheduling, exposes a broader goal_status status surface to the model, and adds a public Plugin SDK lease API for a feature that should be a core session primitive.

I am also not landing this branch mechanically because it is currently conflicting with main across generated SDK baselines, docs nav, lockfile, plugin SDK types, and files that have since moved or been removed.

I will preserve the good parts of the behavior in a replacement PR: core-owned persisted goal state, runtime-owned continuation/accounting, a thin /goal control surface, focused model tools, and fresh tests/proof against current main. Thanks again for the exploration here; it made the right boundary much clearer.

Labels

  • agents: Agent runtime and tooling
  • dependencies-changed: PR changes dependency-related files
  • docs: Improvements or additions to documentation
  • feature: ✨ showcase: ClawSweeper spotlight: unusually compelling feature idea for maintainer attention.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof.
  • merge-risk: 🚨 compatibility: 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • merge-risk: 🚨 session-state: 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.
  • P2: Normal backlog priority with limited blast radius.
  • proof: supplied: External PR includes structured after-fix real behavior proof.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • scripts: Repository scripts
  • size: XL
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

4 participants