Rewrite chdb-datastore and chdb-sql skill descriptions for trigger accuracy#573
Merged
auxten merged 1 commit intoMay 19, 2026
Merged
Conversation
3bd8067 to
381ca73
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates the chDB agent skill frontmatter descriptions to improve skill routing by emphasizing trigger and skip signals.
Changes:
- Rewrites
chdb-sqldescription around SQL-oriented chDB usage and trigger signals. - Rewrites
chdb-datastoredescription around pandas/DataFrame-style usage and sibling-skill routing.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
agent/skills/chdb-sql/SKILL.md |
Updates SQL skill description, trigger criteria, and skip guidance. |
agent/skills/chdb-datastore/SKILL.md |
Updates DataStore skill description, trigger criteria, and skip guidance. |
Comments suppressed due to low confidence (3)
agent/skills/chdb-sql/SKILL.md:17
- The PR description says product-boundary statements for streaming, OLTP, and GPU ML were moved into a body-level
## Workload boundariessection, but this skill file has no such section or boundary text. Without that guidance, the advertised out-of-scope cases are not actually documented for this skill.
SKIP this skill for pandas-style DataFrame method-chaining (use
chdb-datastore instead).
agent/skills/chdb-datastore/SKILL.md:15
- The PR description says product-boundary statements for streaming, OLTP, and GPU ML were moved into a body-level
## Workload boundariessection, but this skill file has no such section or boundary text. Without that guidance, the advertised out-of-scope cases are not actually documented for this skill.
SKIP this skill for raw SQL syntax (use chdb-sql instead),
ClickHouse server administration, or non-Python DataStore API work.
agent/skills/chdb-datastore/SKILL.md:12
- The trigger clause treats a bare mention of
parquetorcsvas sufficient for this DataFrame skill, which overlaps with the SQL skill’sSQL on parquet/csv/filesrouting and can mis-trigger for file-analysis or raw-SQL requests that do not involve pandas-style APIs. Qualifying these signals with DataFrame/pandas intent would preserve the sibling-skill boundary.
TRIGGER when: user mentions DataFrame, parquet, csv, "fast pandas",
"speed up pandas", or cross-source DataFrame joins; user imports
`chdb.datastore` or `from datastore import DataStore`; user shows
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
381ca73 to
c0e58e3
Compare
…ger accuracy Restructure the `description:` frontmatter on both skills to the TRIGGER / SKIP pattern used by Anthropic's own `claude-api` and `skill-creator` skills: - One-paragraph capability statement that front-loads what each skill is for, not how to install it - `TRIGGER when:` clause listing concrete language signals - Short `SKIP this skill for ...` clause covering only sibling-skill routing and clearly out-of-scope work (raw SQL -> chdb-sql, pandas-style method chaining -> chdb-datastore, ClickHouse server admin, non-Python DataStore work) The previous descriptions led with implementation details (`import chdb.datastore as pd`, "16+ data sources", "10+ file formats") that don't help the trigger decision at session start. Alignment with duckdb/duckdb-skills and terrylica/cc-skills convention: keep skill bodies focused on usage examples and let TRIGGER verb- anchoring in the frontmatter handle scope routing. Description content preserved from the original (or added where the original was missing): - chdb-datastore: ClickHouse Cloud kept in the cross-source data list (essential for federation flows) - chdb-sql: 1000+ functions, Session for stateful multi-step pipelines, parametrized queries, six table functions (`s3()` / `mysql()` / `postgresql()` / `iceberg()` / `deltaLake()` / `remoteSecure()`), general window functions (not only `windowFunnel`) Validated by: - 186 LLM-as-judge decisions across 4 iterations and 3 raters on a 62-query test set (TP / TN / ambiguity / adversarial boundary) -> 99.46% pooled accuracy - 15 real Claude Code trigger validations after installing the updated skills locally -- all produced the expected behavior, covering parquet + slow-pandas, windowFunnel, ROW_NUMBER window, Session-based multi-step pipelines, `iceberg()` + Postgres federation, ClickHouse Cloud -> pandas DataFrame, and correct NEITHER on Kafka streaming, MongoDB OLTP, and GPU model training queries Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c0e58e3 to
6e80d6f
Compare
auxten
approved these changes
May 19, 2026
This was referenced May 19, 2026
ShawnChen-Sirius
added a commit
to ShawnChen-Sirius/agent-skills
that referenced
this pull request
May 21, 2026
Mirror the description rewrite that just merged into chdb-io/chdb#573. Body content is unchanged (only the YAML frontmatter `description:` field is touched in each file); the `verify_install.py` path lines that differ between repos are deliberately left as-is. The new descriptions follow the TRIGGER / SKIP pattern that Anthropic's own `claude-api` and `skill-creator` skills use: - One-paragraph capability statement - `TRIGGER when:` clause with concrete language signals - Short `SKIP this skill for ...` clause covering sibling-skill routing (raw SQL -> chdb-sql, pandas method-chaining -> chdb-datastore) plus clearly out-of-scope work (ClickHouse server administration, non-Python DataStore work) The previous descriptions led with implementation details (`import chdb.datastore as pd`, "16+ data sources", "10+ file formats") that don't help the trigger decision at session start. chdb-sql description body also lists Session for stateful pipelines, parametrized queries, and six cross-source table functions (`s3()` / `mysql()` / `postgresql()` / `iceberg()` / `deltaLake()` / `remoteSecure()`), restoring detail that was lost during the restructure pass on chdb-io/chdb (then reviewed by Copilot and added back). Validation upstream of this PR: - 186 LLM-as-judge decisions across 4 iterations and 3 raters on a 62-query test set (TP / TN / ambiguity / adversarial boundary) -> 99.46% pooled accuracy - 27 real Claude Code trigger validations on the rewritten descriptions (covering pandas-slow / windowFunnel / Session / iceberg() federation / ClickHouse Cloud -> DataFrame / parametrized queries / ROW_NUMBER / NEITHER on Kafka streaming, MongoDB OLTP, GPU training, and ClickHouse server administration with SQL- keyword bait) References chdb-io/chdb#573. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Restructure both chDB agent skills'
description:frontmatter to the TRIGGER / SKIP pattern.This matches the convention used by
terrylica/cc-skills, and Anthropic's ownclaude-api/skill-creatorskills — short frontmatter optimized for trigger reliability, nuance in the body.Before vs after
Before: Description led with implementation detail (
import chdb.datastore as pd, "16+ data sources", "10+ file formats") which doesn't help the trigger decision at session start.After:
TRIGGER when:clause with concrete language signalsSKIP this skill for ...clause covering only sibling-skill routing and clearly out-of-scope work (raw SQL → chdb-sql, pandas-style method chaining → chdb-datastore, ClickHouse server admin, non-Python DataStore work)## Workload boundariessection in body for product limits (streaming / OLTP / GPU)Validation
chdb-sqland Claude re-rendered the Python examples in GoTest plan
/skilllistschdb-datastoreandchdb-sqlwith the new descriptionschdb-datastorewindowFunnel()analytical SQL question — should triggerchdb-sql🤖 Generated with Claude Code