docs(spec): SPEC-HF-PUBLISH-001 — canonical HF model publish pipeline#1780

Merged: noahgift merged 5 commits into main from docs/model-publish-pipeline-spec on May 18, 2026
Conversation

@noahgift

Summary

Codifies the publish workflow proven by paiml/albor-370m-v1 (MODEL-2 §88 ship, 2026-05-18) so future model publishes follow it end-to-end.

What this PR adds

New canonical spec: docs/specifications/aprender-train/model-hf-publish-pipeline-spec.md

  • The 12-file minimum a complete HF model repo needs (README, LICENSE, config, generation_config, tokenizer.json, tokenizer_config, vocab+merges, model.safetensors alias, named .safetensors, .apr, .gguf)
  • The YAML front-matter schema with the empty-model-index rejection rule
  • The file-source workflow (pull upstream tokenizer + companions, apr stamp --tokenizer, apr export)
  • The model.safetensors alias pattern via LFS dedup (free duplicate filename)
  • Manual companion-file upload via NDJSON commits (until apr publish's find_model_files is extended — known follow-up)
  • Three-path E2E verification: apr run + HF Transformers AutoModelForCausalLM.from_pretrained + llama-cli
  • HF page-render audit as a Python script gating on required keys + file presence
  • 12-tier crates.io cascade publish order (proven by v0.34.0's 67-crate cascade)
  • HF API gotchas as load-bearing rules: NDJSON + lfsFile key, LFS batch for 5MB–5GB, Xet for >5 GiB, empty model-index → HTTP 400, Q4_K K%256==0, APR-native shape to quantize_q4_k_matrix
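
The file checklist and the front-matter rule above lend themselves to a small pre-publish gate. The following is a minimal sketch, not the audit script shipped with the spec: the twelve filenames follow the list given in this PR's commit message, while the function names, the `required_keys` default, and the choice to take already-parsed front matter as a `dict` are assumptions made for illustration.

```python
from pathlib import Path

# Fixed filenames from the 12-file minimum described in this PR.
FIXED_FILES = [
    "README.md", "LICENSE", "config.json", "generation_config.json",
    "tokenizer.json", "tokenizer_config.json", "vocab.json", "merges.txt",
    "model.safetensors",
]


def required_files(name: str) -> list[str]:
    """Return the 12 required filenames for a model called `name`."""
    return FIXED_FILES + [f"{name}.safetensors", f"{name}.apr", f"{name}-q4k.gguf"]


def missing_files(staging_dir: str, name: str) -> list[str]:
    """List required files that are absent from the staging directory."""
    root = Path(staging_dir)
    return [f for f in required_files(name) if not (root / f).is_file()]


def front_matter_problems(front_matter: dict, required_keys=("license",)) -> list[str]:
    """Check already-parsed YAML front matter.

    `required_keys` is a hypothetical default; the spec defines the real schema.
    A `model-index` that is empty, or whose entries carry empty `results`,
    is reported because the PR states the Hub rejects it with HTTP 400.
    """
    problems = [f"missing key: {k}" for k in required_keys if k not in front_matter]
    if "model-index" in front_matter:
        entries = front_matter["model-index"] or []
        if not entries:
            problems.append("model-index is empty")
        for entry in entries:
            if not entry.get("results"):
                problems.append(
                    f"model-index entry {entry.get('name')!r} has empty results"
                )
    return problems
```

A publish wrapper could refuse to continue whenever either function returns a non-empty list.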

New cascade script: scripts/cascade-publish.sh

  • Walks the 12-tier cascade (Tier 1 leaves through Tier 13 satellites)
  • --check mode reports which crates are still behind a target version (no publish)
  • Bypasses the make publish .cargo/config.toml interaction quirk observed in the v0.34.0 cascade by using direct cargo publish -p <crate> --allow-dirty --locked
  • Retries deferred crates after each tier completes
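
The control flow described for the cascade script (walk tiers in order, skip crates already at the target, retry deferred crates after each tier) can be sketched as follows. This is an illustration in Python rather than the bash script itself; the `cascade` function, its parameters, and the equality-based version comparison are assumptions, and the `publish` callback stands in for the `cargo publish -p <crate> --allow-dirty --locked` call mentioned above.

```python
from typing import Callable


def cascade(tiers, current, target, publish: Callable[[str], bool], check_only=False):
    """Walk dependency tiers leaves-first.

    tiers:   list of lists of crate names, lowest tier first
    current: dict mapping crate name to the version already on the registry
    publish: callback returning True on success, False to defer the crate
    Returns (published, remaining); in check mode `remaining` is every crate
    still behind `target` and nothing is published.
    """
    if check_only:
        behind = [c for tier in tiers for c in tier if current.get(c) != target]
        return [], behind

    published, deferred = [], []
    for tier in tiers:
        for crate in tier:
            if current.get(crate) == target:
                continue  # already at the target version
            if publish(crate):
                published.append(crate)
            else:
                deferred.append(crate)
        # Retry everything deferred so far once this tier has completed.
        still_deferred = []
        for crate in deferred:
            if publish(crate):
                published.append(crate)
            else:
                still_deferred.append(crate)
        deferred = still_deferred
    return published, deferred
```

Keeping the publish step behind a callback makes the walk testable without touching crates.io.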

Parent spec updates

  • ship-model-2-spec.md → v1.3.0; status flips to 100% SHIPPED; new §84 amendment documents the ship + the PMAT-690 P3-C-prep defect cascade and points at SPEC-HF-PUBLISH-001 as the codified output
  • ship-two-models-spec.md → v4.1.0; MODEL-2 row → live HF link; new SPEC-HF-PUBLISH-001 row in the layout table

Why this needs to land

Every defect in the PMAT-690 P3-C-prep cascade (1, 2, 3, 5a, 5b, 5c) was discovered for the first time during the MODEL-2 publish. Without this spec, the next model publish would re-discover them. The spec converts that one-time pain into one-time documentation.

Test plan

  • scripts/cascade-publish.sh --check runs cleanly against current main (returns ✅ ALL at 0.34.0)
  • Spec links resolve (model-hf-publish-pipeline-spec.md, ship-model-2-spec.md §84, GH releases)
  • No code changes — pure docs + a bash script
  • Optional: dry-run the spec on a follow-up model (e.g., re-ship MODEL-1 through the pipeline) to validate the 12-file checklist

🤖 Generated with Claude Code

noahgift and others added 2 commits May 18, 2026 06:39
…236-M242 Branch B

Companion-repo M234 (PR #219, squash eba20ef) discovered the M224 Arena
bench result cited in the prior 2026-05-16 status_history was built on
FOUR stacked harness bugs and is RETRACTED.

Directional verdict (StaticFalsified) stands; supporting evidence is
now M234 (valid, post-harness-rework). CCPA-008 ADVISORY + M230 soft-
deprecation unchanged.

Adds top status_history entry recording:
- 4-bug-stack post-mortem
- M234 valid result: claude 1/5, apr 0/5, oracle_passed_rate = 0.10
- M236-M242 Branch B sequence (CCPA-019 gate, baseline confounder fix,
  stream-json parser, calibration-and-scale corpus)

Original M224-citing entry preserved verbatim as audit trail.

APPEND-ONLY: no version bump, no invariants[]/falsification_conditions[]
edit. CCPA-019 registration deferred to future v1.31.0 PR.

pv validate clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New spec at docs/specifications/aprender-train/model-hf-publish-pipeline-spec.md
codifies the workflow proven by the paiml/albor-370m-v1 (MODEL-2 §88) ship
on 2026-05-18 so future model publishes follow it end-to-end.

What the new spec covers:

- The 12-file minimum: README.md, LICENSE, config.json, generation_config.json,
  tokenizer.json, tokenizer_config.json, vocab.json, merges.txt, model.safetensors
  (HF Transformers alias via LFS dedup), <name>.safetensors, <name>.apr, <name>-q4k.gguf.
- The YAML front-matter schema with the model-index rule (never emit empty results;
  PMAT-690 defect 5c).
- File-source workflow: pull upstream tokenizer + LICENSE + companions, then
  `apr stamp --tokenizer <DIR>` (defect 1), `apr export --quantize int4` (defect 2),
  `apr publish` (defects 5a/5b).
- The `model.safetensors` alias pattern (LFS dedup → free duplicate filename).
- Manual companion-file upload until apr publish's file-selection gap is fixed
  (P3-C-prep defect 6 follow-up).
- Three-path E2E verification: `apr run` + HF Transformers + llama-cli.
- HF page-render audit (Python script gating on required keys + files).
- 12-tier crates.io cascade publish order (proven by v0.34.0's 67-crate cascade).
- HF API gotchas as load-bearing rules: NDJSON+lfsFile, LFS batch for 5MB-5GB,
  Xet for >5 GiB, empty model-index rejected, Q4_K K%256==0, APR-native shape.
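
Two of the gotchas above reduce to simple arithmetic checks. The sketch below is illustrative only: it assumes binary units for both thresholds (the text mixes MB/GB and GiB), assumes that files under the lower threshold travel inline in the commit (the source does not say), and uses hypothetical function names. The 256 divisor matches the Q4_K super-block size.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def choose_upload_path(size_bytes: int) -> str:
    """Pick an upload route by file size.

    "inline" for small files is an assumption; the PR only states
    LFS batch for the 5MB-5GB range and Xet above 5 GiB.
    """
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes < 5 * MIB:
        return "inline"
    if size_bytes <= 5 * GIB:
        return "lfs-batch"
    return "xet"


def q4k_compatible(k: int) -> bool:
    """Q4_K packs weights in super-blocks of 256, so K must be a positive multiple."""
    return k > 0 and k % 256 == 0
```

For example, a tensor with inner dimension 1024 passes the Q4_K check, while one with inner dimension 1000 would need a different quantization or padding.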

Plus scripts/cascade-publish.sh, the cascade walker (with --check mode for
status reports), which bypasses the make-publish .cargo/config.toml
interaction quirk observed in the v0.34.0 cascade.

Parent spec updates:

- ship-model-2-spec.md: bumped to v1.3.0; header status → 100% SHIPPED;
  added §84 amendment documenting the ship + the cascade lessons.
- ship-two-models-spec.md: bumped to v4.1.0; MODEL-2 row → published link;
  new SPEC-HF-PUBLISH-001 row added to the layout table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge (squash) May 18, 2026 05:01
noahgift and others added 3 commits May 18, 2026 07:40
….34.0 banner

User-authored audit content (Popperian falsification + Sorscher 2022 +
Five-Whys) added to docs/specifications/audits/albor-370.md and
docs/specifications/two-model-spec-audit.md. Cross-linked from both
SPEC-HF-PUBLISH-001 §"Lineage / first applied" and ship-model-2-spec.md
§84.3 so future readers find the falsification rigor alongside the
ship narrative.

Top-level README.md updated:
- v0.34.0 banner replaces v0.33.0 banner, references MODEL-2 §88 ship +
  links to SPEC-HF-PUBLISH-001 and v0.34.0 release notes.
- New "Publishing a model?" callout under CLI examples with the 3-step
  stamp → export → publish recipe and a deep link to the spec.
- Library-usage Cargo.toml bumped to aprender = "0.34".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 52c52c9 into main May 18, 2026
10 checks passed
@noahgift noahgift deleted the docs/model-publish-pipeline-spec branch May 18, 2026 06:32
noahgift added a commit that referenced this pull request May 18, 2026
…alias (PMAT-690 P3-C-prep defect 6) (#1783)

Closes the file-selection gap surfaced during the paiml/albor-370m-v1
ship — `apr publish` previously only picked .apr/.safetensors/.gguf
extensions, leaving the operator to manually NDJSON-commit every
companion file (README, LICENSE, config.json, tokenizer.json,
tokenizer_config.json, vocab.json, merges.txt, generation_config.json,
special_tokens_map.json, chat_template.jinja).

What this PR adds
================

1. `find_companion_files(directory)` — case-sensitive exact-filename
   allowlist of HF-standard integration files. Returns paths to
   whichever are present in the staging directory. Decoys (arbitrary
   .json or .txt files outside the allowlist) and binary artifacts
   (.apr/.safetensors/.gguf) are NOT picked up.

2. User-provided README.md preference — when one is present in the
   companion set, its content is used verbatim as the model card
   instead of the auto-generated stub. The auto-generated stub was
   consistently weaker than what model authors hand-craft (observed
   on the albor-370m-v1 publish: 164-byte auto-stub vs 11.6KB hand-
   crafted card).

3. `model.safetensors` LFS alias auto-emit — when a `.safetensors`
   file is uploaded under a descriptive name (e.g.,
   `albor-370m-v1.safetensors`), a second NDJSON `lfsFile` commit
   emits the alias `model.safetensors` pointing at the same OID.
   HF deduplicates LFS blobs by OID so the alias is storage-free.
   Required for HF Transformers `AutoModelForCausalLM.from_pretrained`
   to auto-discover the weights without an explicit weights_file
   argument.

4. New public method `HfHubClient::commit_lfs_alias` in aprender-core
   — wraps the existing NDJSON commit-lfs-pointer path so the
   apr-cli publish command can emit the alias commit.
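
To make the alias commit in items 3 and 4 concrete, here is a minimal sketch of the NDJSON body such a commit might carry. It is written in Python for illustration and is not the `HfHubClient::commit_lfs_alias` implementation; the `header`/`lfsFile` line layout is assumed to mirror what Hub clients send to the commit endpoint, and both function names are hypothetical.

```python
import hashlib
import json


def lfs_oid(path: str) -> tuple[str, int]:
    """Return (sha256 hex digest, size in bytes) for a file, read in chunks."""
    digest, size = hashlib.sha256(), 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def alias_commit_ndjson(oid: str, size: int, summary: str,
                        alias: str = "model.safetensors") -> str:
    """Build an NDJSON body with one header line and one lfsFile line.

    The lfsFile entry points `alias` at an OID that is already uploaded,
    so no blob is transferred a second time.
    """
    lines = [
        {"key": "header", "value": {"summary": summary, "description": ""}},
        {"key": "lfsFile",
         "value": {"path": alias, "algo": "sha256", "oid": oid, "size": size}},
    ]
    return "\n".join(json.dumps(line) for line in lines) + "\n"
```

Because the alias reuses the OID of the descriptively named file, the only new data in the commit is the pointer itself.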

Reference implementation
========================

Follows SPEC-HF-PUBLISH-001 (committed 2026-05-18 in #1780):
- §"Required artifacts (12 files minimum)" — companion files list
- §"Publishing the `model.safetensors` alias" — alias protocol

Removes the manual NDJSON commit pattern documented in the spec's
§"Manual companion-file upload until publish CLI is fixed" — that
section can now be marked stale + linked to this PR.

Tests
=====

6 new unit tests in publish_tests.rs:
- find_companion_files picks all 10 allowlist entries when present
- find_companion_files skips decoys + binary artifacts
- find_companion_files empty dir returns empty
- safetensors_needing_alias triggers on descriptive names
- safetensors_needing_alias skips canonical model.safetensors
- safetensors_needing_alias skips .apr/.gguf-only publishes

All 35 commands::publish::tests pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>