feat: add crawl archive skills#83040

Merged
steipete merged 3 commits into main from feat/crawl-archive-skills on May 17, 2026

@steipete steipete commented May 17, 2026

Contributor

Summary

Verification

  • Parsed SKILL.md frontmatter and agents/openai.yaml for discrawl, slacrawl, graincrawl, notcrawl, gitcrawl.
  • git diff --check
  • node scripts/run-vitest.mjs src/agents/skills/frontmatter.test.ts
  • clawhub inspect discrawl|slacrawl|graincrawl|notcrawl|gitcrawl --json shows owner openclaw; discrawl/gitcrawl latest 1.0.0, slacrawl/graincrawl/notcrawl latest 1.0.1.

Behavior addressed: stock OpenClaw crawler skills now have canonical repo-local definitions, OpenClaw repo/module metadata, and root ClawHub slugs.
Real environment tested: local OpenClaw checkout and live ClawHub registry.
Exact steps or command run after this patch: frontmatter YAML parse, git diff --check, node scripts/run-vitest.mjs src/agents/skills/frontmatter.test.ts, clawhub inspect for all five root slugs.
Evidence after fix: each root slug resolves under openclaw; Slacrawl, Graincrawl, and Notcrawl latest metadata versions point at OpenClaw repos/modules.
Observed result after fix: root installs/searches resolve to OpenClaw-owned stock skill entries.
What was not tested: installing each skill through a fresh OpenClaw runtime.

@openclaw-barnacle (bot) added the size: M and maintainer (Maintainer-authored PR) labels on May 17, 2026

clawsweeper Bot commented May 17, 2026

Contributor

Codex review: needs real behavior proof before merge.

Summary
The PR adds bundled Discrawl, Slacrawl, Graincrawl, and Notcrawl skill definitions, compacts Gitcrawl’s bundled skill, and adds OpenAI agent metadata/install recipes for all five crawler tools.

Reproducibility: not applicable. This PR adds optional bundled skill definitions rather than fixing a reported current-main bug. The reviewable behavior is source-verifiable metadata plus installer provenance, not a user reproduction path.

Real behavior proof
Not applicable: The PR has a protected maintainer label, so the external-contributor real behavior proof gate does not apply; the body provides ClawHub inspect proof but no fresh runtime install smoke.

Next step before merge
Protected maintainer-labeled bundled-skill promotion and executable install metadata need human product/security approval, not an automated repair or cleanup close.

Security
Cleared: No concrete security or supply-chain defect found in the current head; the executable Go install recipes now use OpenClaw-owned module paths and existing installer validation.

Review details

Best possible solution:

Maintainers should decide whether these ClawHub-published crawler skills warrant bundled promotion; if yes, keep the canonical OpenClaw module metadata and add a disposable install smoke before landing.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this PR adds optional bundled skill definitions rather than fixing a reported current-main bug. The reviewable behavior is source-verifiable metadata plus installer provenance, not a user reproduction path.

Is this the best way to solve the issue?

Unclear until maintainer approval: the current patch is internally coherent after the module-path fix, but VISION.md makes bundled promotion a human product/security decision over the ClawHub-only path.

Acceptance criteria:

  • git diff --check
  • node scripts/run-vitest.mjs src/agents/skills/frontmatter.test.ts
  • Run a disposable OpenClaw runtime skills.install smoke for discrawl, slacrawl, graincrawl, notcrawl, and gitcrawl.

What I checked:

  • Protected PR context: The provided live PR context shows labels maintainer, P3, and impact:security; repository policy says protected-label items stay open for explicit maintainer handling.
  • Diff scope: The PR diff against its base changes only .agents/skills crawler skill definitions and OpenAI metadata files, with no core runtime code changes. (.agents/skills/gitcrawl/SKILL.md:1, 51acf671a516)
  • Bundled-skill policy: Current VISION.md says new skills should publish through ClawHub first and bundled promotion needs a clear product, security, or maintainer-ownership reason. (VISION.md:80, 51e93669cb4f)
  • Prior installer provenance finding was addressed: The PR follow-up commit changed Slacrawl, Graincrawl, and Notcrawl homepage/module metadata from personal namespaces to canonical github.com/openclaw/* paths. (.agents/skills/slacrawl/SKILL.md:12, 51acf671a516)
  • Installer execution contract: Current main validates Go install specs with an allowlisted pattern and executes them as argv via go install, so the PR metadata is the executable provenance reviewed here. (src/agents/skills-install.ts:147, 51e93669cb4f)
  • Frontmatter parsing contract: Current main normalizes Go module metadata and rejects empty, dash-prefixed, URL, backslash, or invalid-pattern module specs before exposing skill install options. (src/agents/skills/frontmatter.ts:60, 51e93669cb4f)
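
The two contracts described in the last two bullets can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in src/agents/skills-install.ts or src/agents/skills/frontmatter.ts: the names `GO_MODULE_PATTERN`, `normalizeGoModule`, and `buildGoInstallArgv` are hypothetical, and the regular expression is an assumed allowlist rather than the one the repo uses.

```typescript
// Hypothetical sketch: names and the pattern are illustrative assumptions,
// not the actual OpenClaw implementation.

// Assumed allowlist: a host segment, one or more path segments, and an
// optional "@version" suffix.
const GO_MODULE_PATTERN =
  /^[A-Za-z0-9][A-Za-z0-9._~-]*(\/[A-Za-z0-9._~-]+)+(@[A-Za-z0-9._+-]+)?$/;

// Returns the trimmed spec, or undefined when the spec must be rejected.
export function normalizeGoModule(raw: unknown): string | undefined {
  if (typeof raw !== "string") return undefined;
  const spec = raw.trim();
  if (spec === "") return undefined; // empty
  if (spec.startsWith("-")) return undefined; // could be read as a flag
  if (spec.includes("://")) return undefined; // URL, not a module path
  if (spec.includes("\\")) return undefined; // backslash
  if (!GO_MODULE_PATTERN.test(spec)) return undefined; // not allowlisted
  return spec;
}

// Builds an argv array so the spec is never interpolated into a shell string.
export function buildGoInstallArgv(raw: unknown): string[] {
  const mod = normalizeGoModule(raw);
  if (mod === undefined) {
    throw new Error(`rejected Go module spec: ${String(raw)}`);
  }
  return ["go", "install", mod];
}
```

Passing the result to something like `execFile(argv[0], argv.slice(1))` rather than a shell string is what makes the frontmatter metadata the only executable input, which is why the review treats the module path as the provenance to check.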

Likely related people:

  • vincentkoc: Current-main blame attributes the existing bundled Gitcrawl skill, skill installer validation, frontmatter Go-module parsing, and VISION bundled-skill policy lines to the same recent area commit, making this the best routing signal for bundled skill/install policy. (role: recent area contributor; confidence: medium; commits: 3918d6958718; files: .agents/skills/gitcrawl/SKILL.md, src/agents/skills-install.ts, src/agents/skills/frontmatter.ts)

Remaining risk / open question:

  • Bundled promotion is still a product/security decision because the repo policy prefers ClawHub-first distribution for new skills.
  • The PR body does not include a fresh disposable OpenClaw skills.install smoke for the five crawler tools.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 51e93669cb4f.

@clawsweeper (bot) added the P3 (Low-priority cleanup, docs, polish, ergonomics, or speculative work) and impact:security (Security boundary, credential, authz, sandbox, or sensitive-data risk) labels on May 17, 2026
@steipete force-pushed the feat/crawl-archive-skills branch from 51acf67 to 77e9b4e on May 17, 2026 at 11:13

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77e9b4e441

- slacrawl
install:
- kind: go
  module: github.com/vincentkoc/slacrawl/cmd/slacrawl@latest

P1: Point installer modules at OpenClaw-owned crawl repos

The new crawl skills advertise OpenClaw ownership via metadata.openclaw.homepage, but the install module here (and the matching lines in graincrawl/notcrawl) still points to github.com/vincentkoc/.... Because the installer uses this module path to fetch binaries, users can end up installing from a different owner than the declared homepage, which breaks provenance and can pull stale/divergent code if those repos drift. Update these module values to the OpenClaw module paths to keep install source and ownership metadata consistent.
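
A check for this kind of mismatch might look like the sketch below. It is only an illustration: `moduleOwnerMatchesHomepage` is a hypothetical helper, it only understands github.com paths, and the homepage URL in the usage note is an assumed value, not one quoted from the PR.

```typescript
// Hypothetical helper: compares the owner in a GitHub homepage URL with the
// owner segment of a github.com Go module path. Not part of OpenClaw.
export function moduleOwnerMatchesHomepage(
  homepage: string,
  moduleSpec: string,
): boolean {
  let url: URL;
  try {
    url = new URL(homepage);
  } catch {
    return false; // homepage is not a parseable URL
  }
  if (url.hostname !== "github.com") return false;

  // First non-empty path segment of the homepage is the owner.
  const homeOwner = url.pathname.split("/").filter(Boolean)[0];

  // Drop the "@version" suffix, then read "host/owner/..." from the module.
  const modulePath = moduleSpec.split("@")[0] ?? "";
  const [host, modOwner] = modulePath.split("/");

  if (host !== "github.com" || !homeOwner || !modOwner) return false;
  return homeOwner.toLowerCase() === modOwner.toLowerCase();
}
```

With an assumed homepage of `https://github.com/openclaw/slacrawl`, the module path quoted above (`github.com/vincentkoc/slacrawl/cmd/slacrawl@latest`) would fail this check, which is the inconsistency the comment describes.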

@steipete steipete merged commit 4aa671b into main May 17, 2026
88 of 89 checks passed
@steipete steipete deleted the feat/crawl-archive-skills branch May 17, 2026 11:18
Contributor Author

Landed via temp rebase onto main.

Behavior addressed: bundled crawl archive skills added for Discrawl, Gitcrawl, Graincrawl, Notcrawl, and Slacrawl; install recipes corrected to declared Go module paths before merge.
Real environment tested: local macOS checkout plus Blacksmith Testbox tbx_01krtt4ywtxs5fxwftmyewxaaq; disposable Go installs into temp GOBIN.
Exact steps or command run after this patch:

  • git diff --check temp/landpr-83040...HEAD
  • YAML frontmatter parse for changed SKILL.md and agents/openai.yaml files via Ruby Psych.safe_load
  • node scripts/run-vitest.mjs src/agents/skills/frontmatter.test.ts src/agents/skills.bundled-frontmatter.test.ts src/agents/skills.buildworkspaceskillstatus.test.ts src/agents/skills-install.test.ts
  • go install smoke, one module per invocation, into /tmp/openclaw-crawl-skills.OaZzIe
  • pnpm check:changed delegated to Blacksmith Testbox tbx_01krtt4ywtxs5fxwftmyewxaaq

Evidence after fix: discrawl, gitcrawl, graincrawl, notcrawl, and slacrawl binaries all appeared in temp GOBIN; focused Vitest passed 7 files / 41 tests; Testbox changed-gate exited 0; GitHub CI for the rebased head was green except the non-blocking Real behavior proof bot status.
Observed result after fix: PR merged with rebase.
What was not tested: no live archive sync against Discord, Slack, Granola, Notion, or GitHub account data.
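
For readers who want to repeat the disposable install smoke, the shape of it is roughly as follows. This is a sketch under assumptions, not the script used for this PR: `smokeInstall`, `binaryNameFor`, and the injectable `Runner` are hypothetical names, and the temp directory prefix merely mirrors the path mentioned above. It relies on `go install` honoring the `GOBIN` environment variable and naming the binary after the last element of the package path.

```typescript
import { execFileSync } from "node:child_process";
import { existsSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, join } from "node:path";

type Env = Record<string, string | undefined>;
// Injectable so the loop can be exercised without a Go toolchain.
type Runner = (args: string[], env: Env) => void;

// Binary name `go install` produces for a spec like ".../cmd/slacrawl@latest"
// (on Windows the file would additionally carry an ".exe" suffix).
export function binaryNameFor(spec: string): string {
  const pkgPath = spec.split("@")[0] ?? spec;
  return basename(pkgPath);
}

// Default runner: invokes the real `go` binary with an argv array.
const runGo: Runner = (args, env) => {
  execFileSync("go", args, { env, stdio: "inherit" });
};

// Installs one module per invocation into a fresh temp GOBIN and returns
// the specs whose binary did not appear there.
export function smokeInstall(specs: string[], run: Runner = runGo): string[] {
  const gobin = mkdtempSync(join(tmpdir(), "openclaw-crawl-skills."));
  const missing: string[] = [];
  for (const spec of specs) {
    run(["install", spec], { ...process.env, GOBIN: gobin });
    if (!existsSync(join(gobin, binaryNameFor(spec)))) missing.push(spec);
  }
  return missing;
}
```

An empty return value corresponds to the "binaries all appeared in temp GOBIN" evidence; the module specs themselves would come from each skill's install metadata.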

Source PR head: 77e9b4e
Landed commits:

galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 20, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Labels

impact:security Security boundary, credential, authz, sandbox, or sensitive-data risk. maintainer Maintainer-authored PR P3 Low-priority cleanup, docs, polish, ergonomics, or speculative work. size: M
