fix(skills): add nutrigx_advisor compat symlink for bench harness#215
Merged
manuelcorpas merged 1 commit into main on May 3, 2026
Conversation
Restores nutrigx end-to-end testability after the AgentSkills naming rename in e4ed975 (skills/nutrigx_advisor to skills/nutrigx-advisor). The clawbio_bench v0.1.5 nutrigx_harness hardcodes the legacy underscore path at nutrigx_harness.py:511, so all 10 nutrigx test cases were failing on the live repo with exit_code 2 ("script not found"). Bench reads from a git checkout, so an untracked symlink is invisible to it; this needs to be a tracked symlink committed to the tree.

Effect on benchmark
- nutrigx-advisor: 0/10 (0.0%) -> 10/10 (100.0%)
- Aggregate (clawbio-bench v0.1.5 smoke): 139/162 (85.8%) -> 149/162 (92.0%), excluding fine-mapping infrastructure errors

The symlink is a temporary compatibility shim. The proper fix is a PR to biostochastics/clawbio_bench to either resolve skill folders dynamically (try hyphen, fall back to underscore) or read the path from a per-skill manifest. Once that lands, this symlink can be removed.

Verified locally
- python clawbio.py run nutrigx --demo (exit 0, full repro bundle)
- clawbio-bench --smoke --harness nutrigx (10/10 PASS)
- clawbio-bench --smoke (149/162, 92.0%)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
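The shim described above amounts to a relative symlink at the legacy underscore path, committed so a fresh checkout reproduces it. A minimal sketch (the directory layout here is illustrative, not the exact repo tree):

```shell
# Sketch of the compat shim: a relative symlink at the legacy underscore
# path pointing at the renamed hyphenated directory.
set -e
repo="$(mktemp -d)"
mkdir -p "$repo/skills/nutrigx-advisor"

# relative target, so the link resolves inside any checkout of the repo
ln -s nutrigx-advisor "$repo/skills/nutrigx_advisor"

# git stores the link itself (tree entry mode 120000), not the target's
# contents -- an untracked link would never reach the bench's checkout
readlink "$repo/skills/nutrigx_advisor"
```

Committing the link with a plain `git add skills/nutrigx_advisor` is enough; no content is duplicated.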
This was referenced May 3, 2026
smoe pushed a commit to smoe/ClawBio that referenced this pull request on May 3, 2026
Updates the public benchmark leaderboard and the homepage to reflect the 2026-05-03 re-run of clawbio_bench v0.1.5, after the nutrigx compat-symlink fix in PR ClawBio#215.

Headline numbers
- Audit baseline (2026-04-05, bench v0.1.0): 80 / 140 (57.1%)
- Latest run (2026-05-03, bench v0.1.5): 149 / 162 (92.0%)
- Skills audited: 7 to 10 (3 new harnesses in v0.1.5)

Per-skill changes
- equity-scorer: 20.0% to 100.0% (P0 findings resolved)
- nutrigx-advisor: 80.0% to 100.0% (today's symlink fix)
- pharmgx-reporter: 42.4% to 97.7%
- bio-orchestrator: 75.9% to 98.1%
- claw-metagenomics: 85.7% to 100.0%
- clinical-variant-reporter: 80.0% to 80.0% (unchanged)
- fine-mapping: now reports 21 harness infrastructure errors, excluded from rate (under investigation)
- New harnesses: cvr-acmg-correctness 69.2%, gwas-prs 62.5%, cvr-variant-identity 50.0%

Site changes
- benchmarks.html
  - Hero adds "We fix them" arc
  - Audit metadata card "Auditor: Sergey Kornilov" replaced with "Bench Author: Biostochastics LLC", focusing the credit on the open-source clawbio_bench tool rather than an individual
  - Bench version v0.1.0 to v0.1.5
  - Audit commit 1481fb4 to 925b89a
  - Summary bar: 80/140 to 149/162
  - Scorecard table: full rebuild against new run, 10 rows, status pills updated (most P0/P1 now Clear)
  - Footer: original audit baseline retained for transparency
  - JSON-LD dateModified bumped
- index.html
  - Top banner: "80/140 across 7 skills" to "149/162 (92.0%) | Up from 80/140 (57.1%) at original audit"
  - Hero CTA pill: "Leaderboard 80/140" to "Leaderboard 149/162"
  - Benchmark Validated feature card: numbers refreshed
  - Skills section header: "7 publicly benchmarked" to "149/162 publicly benchmarked passing"
  - Recent Updates: leaderboard card now references PR ClawBio#215 too, cites clawbio_bench, includes the gain numbers

The auditor-name swap follows the principle that the bench is a public tool with org credit; a single individual's name is not the right framing for a permanent front-page asset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
smoe pushed a commit to smoe/ClawBio that referenced this pull request on May 3, 2026
Resolves the 21 harness errors observed in clawbio_bench v0.1.5
finemapping driver. The driver does:
sys.path.insert(0, skill_dir)
from core.abf import compute_abf
from core.susie import run_susie
from core.credible_sets import ...
from core.susie_inf import run_susie_inf # optional
The skill restructured its internal package from `core` to
`fine_mapping_core` at some point but the bench harness still
expects `core`. Adds a tracked directory symlink so the legacy
import path resolves until the bench is updated.
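The mechanism can be sketched in isolation; the package and function names below are illustrative stand-ins for the skill's real modules, not its actual code:

```python
# Sketch: a directory symlink makes the legacy `core` import path resolve
# to the renamed package. Names here are illustrative.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "fine_mapping_core")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "abf.py"), "w") as f:
    f.write("def compute_abf():\n    return 'abf-ok'\n")

# the compat shim: `core` becomes a second name for the same directory
os.symlink("fine_mapping_core", os.path.join(root, "core"))

sys.path.insert(0, root)          # what the bench driver does with skill_dir
from core.abf import compute_abf  # the legacy import now resolves
print(compute_abf())              # prints "abf-ok"
```

Python's import machinery follows the filesystem symlink transparently, which is why no code change in the skill itself is needed.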
Effect on benchmark
- clawbio-finemapping: 0/0 with 21 harness errors -> 19/20 (95.0%)
with 1 real algorithm failure (susie_inf_est_tausq_ignored)
- Aggregate: 149/162 (92.0%) excluding finemapping -> 168/182
(92.3%) including finemapping
Same pattern as the nutrigx_advisor symlink in PR ClawBio#215. The proper
fix is in biostochastics/clawbio_bench: either resolve skill-package
names dynamically or read import paths from a per-skill manifest.
Once that lands, both symlinks can be removed.
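The dynamic-resolution option proposed for the bench could look roughly like this; a sketch only, with the helper name, `skills/` layout, and fallback order all assumptions rather than clawbio_bench's actual API:

```python
from pathlib import Path

def resolve_skill_dir(repo_root: Path, name: str) -> Path:
    """Hypothetical helper: try the hyphenated skill folder first,
    then fall back to the legacy underscore spelling, so either
    repo layout works without compat symlinks."""
    for candidate in (name.replace("_", "-"), name.replace("-", "_")):
        path = repo_root / "skills" / candidate
        if path.is_dir():
            return path
    raise FileNotFoundError(
        f"no skill folder for {name!r} in {repo_root / 'skills'}"
    )
```

With a helper like this in the bench, both tracked symlinks could be deleted from the skill repo.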
Verified
- clawbio-bench --smoke --harness finemapping: 19/20 pass
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Restores nutrigx-advisor end-to-end testability against clawbio-bench v0.1.5. After the AgentSkills naming rename in e4ed975 (skills/nutrigx_advisor to skills/nutrigx-advisor), the bench harness was still pointing at the legacy underscore path and all 10 nutrigx test cases were failing with exit_code: 2 ("script not found"). This adds a tracked directory symlink so the legacy path resolves until the bench is updated.
Impact on the public benchmark scorecard
Compared to the original 2026-04-05 audit baseline of 80 / 140 (57.1%), the live scorecard is now +35 percentage points across 7 of 10 harnesses (the other 3 are new harnesses added in bench v0.1.5).
Why a symlink and not a code fix
The bench harness invokes the skill script directly (nutrigx_harness.py:511), bypassing the ClawBio CLI, so updating clawbio.py's SKILLS dict does not help. The bench reads from a git checkout (untracked files are invisible), which is why the symlink has to be tracked.

Why this is temporary
The proper fix is in
biostochastics/clawbio_bench: either resolve skill folders dynamically (try hyphen, fall back to underscore) or read the script path from a per-skill manifest. I will open a follow-up issue / PR there. Once that lands, this symlink can be removed.

Verified locally
- python clawbio.py run nutrigx --demo returns exit 0 and writes the full reproducibility bundle (commands.sh, environment.yml, checksums.txt, provenance.json, nutrigx_report.md, result.json).
- clawbio-bench --smoke --harness nutrigx reports 10 / 10 passing with score_correct, snp_valid, threshold_consistent, repro_functional.
- clawbio-bench --smoke aggregate run reports 149 / 162 (92.0%).

Test plan
- git clone https://github.com/ClawBio/ClawBio.git && cd ClawBio && python clawbio.py run nutrigx --demo returns exit 0
- clawbio-bench --smoke --harness nutrigx --repo . returns 10 / 10 passing

Follow-up (separate PR)
- clawbio.py for other broken SKILLS_DIR / "name" references (one other was found: llm-biobank-bench has no folder; out of scope here)
- benchmarks.html to reflect 149 / 162 (92.0%) once this PR is merged

🤖 Generated with Claude Code