feat(apr-publish): LFS batch upload + NDJSON commits + valid model-index YAML (PMAT-690 defect 5)#1772
Merged
…dex YAML (PMAT-690 P3-C-prep defect 5)
Surfaced while publishing paiml/albor-370m-v1 on 2026-05-17. apr publish
returned "✓ Published" yet the repo showed only .gitattributes. Three
distinct sub-defects cascaded together:
5a. LFS batch upload missing for 5MB–5GB files
============================================
HF's preupload endpoint returns `uploadMode: "lfs"` with no inline
URLs for files in the 5MB–5GB band, expecting the client to fetch the
presigned S3 URL via the LFS Batch API
(POST `/{repo}.git/info/lfs/objects/batch`). Our upload_via_lfs skipped
this step entirely and went straight to commit_lfs_pointer, landing
orphaned pointers (the Xet branch handles >5GiB; nothing covered the
gap below it).
Fix: new `upload_via_lfs_batch` method ported from aprender-data's
working flow. Calls batch API → parses `objects[0].actions.upload.href`
→ PUTs the blob with optional headers → optional verify POST.
Empirical: paiml/albor-370m-v1 .apr (2.52GB) + .gguf (2.17GB) +
.safetensors (2.52GB) all upload at ~67 MB/s.
5b. JSON `addOrUpdate` commit returns 200 but drops files
=========================================================
The memory rule `feedback_hf_commit_ndjson_load_bearing.md`
(2026-04-18) — "HF commit MUST use application/x-ndjson + lfsFile key"
— was already known, but only enforced in aprender-data. The model
publish path in aprender-core used JSON with `op: "addOrUpdate"` for
BOTH the LFS-pointer commit and the small-file commit. HF returned
HTTP 200 + `success: true` for both, but actually persisted nothing
beyond `.gitattributes`.
Fix:
- commit_lfs_pointer now sends NDJSON `{key: "header"} \n {key: "lfsFile", value: {path, algo: "sha256", oid, size}}` with `Content-Type: application/x-ndjson`.
- upload_direct now sends NDJSON `{key: "header"} \n {key: "file", value: {path, content (base64), encoding: "base64"}}` (matches build_ndjson_upload_payload in aprender-data).
5c. Auto-generated README rejected by HF (HTTP 400)
====================================================
ModelCard::to_huggingface unconditionally emits `model-index:` with a
name but only emits `results:` if metrics are non-empty. HF's metadata
validator requires `model-index[0].results`, so the auto-generated
README was rejected with:
"model-index[0].results" is required
Fix: skip the entire `model-index:` block when metrics are empty.
Empty model-index is invalid and signal-free anyway.
End-to-end verification
=======================
After all three fixes:
$ apr publish /tmp/albor-370m-staging paiml/albor-370m-v1 ...
✓ Published
$ curl ".../api/models/paiml/albor-370m-v1/tree/main"
1632 .gitattributes
164 README.md
2166458784 albor-370m-v1-q4k.gguf (lfs=True)
2524492804 albor-370m-v1.apr (lfs=True)
2520702380 albor-370m-v1.safetensors (lfs=True)
All 3 LFS artifacts + auto-generated README on the repo.
Known gaps (filed as follow-up)
================================
- find_model_files (apr-cli/src/commands/publish.rs:496) only picks
.apr/.safetensors/.gguf — companion files (config.json, vocab.json,
merges.txt, user-provided README.md from staging dir) are skipped.
- The 11.6KB user-authored model card in /tmp/albor-370m-staging/
README.md was ignored in favor of the auto-generated 164-byte stub.
These are file-selection defects, not upload-correctness — separate PR.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ust 1.93) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Three cascading defects in `apr publish` made the publish fail silently. Surfaced while publishing paiml/albor-370m-v1 on 2026-05-17:
- `upload_via_lfs` skipped the LFS batch API step entirely for files that HF returns with `uploadMode: lfs` and no inline URL (the band between regular preupload and Xet's 5GiB threshold).
- `commit_lfs_pointer` and `upload_direct` used JSON `addOrUpdate` operations. HF replies `success: true` but silently discards the file. Memory rule `feedback_hf_commit_ndjson_load_bearing.md` (2026-04-18) already mandated NDJSON + lfsFile, but the rule was only enforced in aprender-data.
- `ModelCard::to_huggingface` emitted `model-index:` without `results:` when metrics were empty, so HF rejected the auto-generated README with HTTP 400.

After all three fixes, a fresh publish lands the LFS artifacts + auto-generated README:
```
1632 .gitattributes
164 README.md
2166458784 albor-370m-v1-q4k.gguf (lfs=True)
2524492804 albor-370m-v1.apr (lfs=True)
2520702380 albor-370m-v1.safetensors (lfs=True)
```
paiml/albor-370m-v1 is now live on HF Hub with all 3 binary artifacts.
Detail
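5a — `upload_via_lfs_batch` (new)
Implements the standard LFS Batch API flow:
- POST `/{repo}.git/info/lfs/objects/batch` to request an upload action for the blob.
- Parse `objects[0].actions.upload.href` from the response.
- PUT the blob to that URL, with any headers the response supplies.
- POST to the verify action if the response includes one.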
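The flow above can be sketched with the standard library alone. This is an illustration, not the code in this PR: `Action`, `Actions`, `Step`, `batch_request_body`, and `plan_steps` are hypothetical names, and the request fields follow the generic Git LFS batch specification on the assumption that HF's endpoint accepts that shape. The sketch builds the batch request body and then decides which HTTP calls remain once the response has been parsed.

```rust
use std::collections::BTreeMap;

/// One transfer action from a batch response: a URL plus optional headers.
#[derive(Debug, Clone, PartialEq)]
struct Action {
    href: String,
    header: BTreeMap<String, String>,
}

/// The `actions` object of a single `objects[]` entry (hypothetical model).
#[derive(Debug, Clone, Default, PartialEq)]
struct Actions {
    upload: Option<Action>,
    verify: Option<Action>,
}

/// HTTP calls the client still has to make for one object.
#[derive(Debug, PartialEq)]
enum Step {
    Put { url: String, headers: BTreeMap<String, String> },
    Verify { url: String },
}

/// Build the JSON body for POST `/{repo}.git/info/lfs/objects/batch`.
/// `oid` is assumed to be a lowercase hex SHA-256, so it needs no escaping.
fn batch_request_body(oid: &str, size: u64) -> String {
    format!(
        r#"{{"operation":"upload","transfers":["basic"],"hash_algo":"sha256","objects":[{{"oid":"{oid}","size":{size}}}]}}"#
    )
}

/// Turn the parsed `actions` of `objects[0]` into the remaining steps.
/// No `actions` (or no `upload` action) means there is nothing to PUT.
fn plan_steps(actions: Option<&Actions>) -> Vec<Step> {
    let actions = match actions {
        Some(a) => a,
        None => return Vec::new(),
    };
    let mut steps = Vec::new();
    if let Some(upload) = &actions.upload {
        steps.push(Step::Put {
            url: upload.href.clone(),
            headers: upload.header.clone(),
        });
        // Verify is optional and only meaningful after an upload.
        if let Some(verify) = &actions.verify {
            steps.push(Step::Verify { url: verify.href.clone() });
        }
    }
    steps
}

fn main() {
    println!("{}", batch_request_body("0123abcd", 2_166_458_784));
    let actions = Actions {
        upload: Some(Action {
            href: "https://example.invalid/upload".to_string(),
            header: BTreeMap::new(),
        }),
        verify: None,
    };
    println!("{:?}", plan_steps(Some(&actions)));
}
```

Keeping the planning step separate from the HTTP client makes the defect in 5a easy to test for: a pointer commit should only follow once every planned `Put` has completed.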
5b — NDJSON commit format
Both file-commit paths now send `application/x-ndjson` with two lines:
```json
{"key":"header","value":{"summary":"...","description":""}}
{"key":"lfsFile","value":{"path":"...","algo":"sha256","oid":"...","size":...}}
```
(For small files: `{"key":"file","value":{"path":"...","content":"base64...","encoding":"base64"}}`.)
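As a minimal sketch of how the two-line LFS payload might be assembled without a JSON library: `json_escape` and `lfs_commit_payload` are hypothetical helpers, not the functions in aprender-core, and the trailing newline after the second line is an assumption about what the endpoint tolerates.

```rust
/// Escape a string for embedding inside a JSON string literal.
/// Minimal on purpose: quotes, backslashes, and control characters only.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the NDJSON body: a `header` line followed by an `lfsFile` line.
/// Send it with `Content-Type: application/x-ndjson`.
fn lfs_commit_payload(summary: &str, path: &str, oid: &str, size: u64) -> String {
    let header = format!(
        r#"{{"key":"header","value":{{"summary":"{}","description":""}}}}"#,
        json_escape(summary)
    );
    let lfs_file = format!(
        r#"{{"key":"lfsFile","value":{{"path":"{}","algo":"sha256","oid":"{}","size":{}}}}}"#,
        json_escape(path),
        json_escape(oid),
        size
    );
    // One JSON document per line; the final newline is an assumption.
    format!("{header}\n{lfs_file}\n")
}

fn main() {
    print!(
        "{}",
        lfs_commit_payload("Upload model", "albor-370m-v1-q4k.gguf", "0123abcd", 2_166_458_784)
    );
}
```

The small-file variant would differ only in the second line, which carries `file` with base64 content instead of `lfsFile`.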
5c — Conditional model-index block
```rust
if !self.metrics.is_empty() {
output.push_str("model-index:\n");
// ... only emit results when there are metrics to report
}
```
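A fuller, self-contained sketch of the same guard follows. The `ModelCard` struct here is a hypothetical stand-in with only the fields the example needs, `front_matter` is an invented method name, and the layout of the entries under `results:` is illustrative rather than checked against HF's validator.

```rust
/// Hypothetical stand-in for the real `ModelCard`.
struct ModelCard {
    name: String,
    /// (metric type, value) pairs; empty when nothing was evaluated.
    metrics: Vec<(String, f64)>,
}

impl ModelCard {
    /// Emit YAML front matter, skipping `model-index:` when there are no metrics.
    fn front_matter(&self) -> String {
        let mut out = String::from("---\n");
        if !self.metrics.is_empty() {
            // `model-index` and `results` are always emitted together,
            // so the block can never appear without `results`.
            out.push_str("model-index:\n");
            out.push_str(&format!("- name: {}\n", self.name));
            out.push_str("  results:\n");
            out.push_str("  - metrics:\n");
            for (metric, value) in &self.metrics {
                out.push_str(&format!("    - type: {metric}\n      value: {value}\n"));
            }
        }
        out.push_str("---\n");
        out
    }
}

fn main() {
    let empty = ModelCard { name: "albor-370m-v1".to_string(), metrics: vec![] };
    print!("{}", empty.front_matter());

    let scored = ModelCard {
        name: "albor-370m-v1".to_string(),
        metrics: vec![("accuracy".to_string(), 0.5)],
    };
    print!("{}", scored.front_matter());
}
```

Placing the guard around the whole block, rather than around `results:` alone, is what rules out the half-emitted `model-index` that triggered the HTTP 400.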
Known gaps (separate follow-up)
`find_model_files` in apr-cli/src/commands/publish.rs only picks .apr/.safetensors/.gguf — config.json, vocab.json, merges.txt, and user-provided README.md from the staging dir are skipped. The 11.6KB user-authored model card was ignored in favor of the auto-generated 164-byte stub. This is a file-selection defect, not upload-correctness — separate PR.
Test plan
- `model-index` YAML behavior with empty + non-empty metrics

🤖 Generated with Claude Code