Add training skill: train-sentence-transformers #3752
Merged
Pull request overview
This PR adds three Hugging Face “Agent Skills” under skills/ to enable end-to-end training workflows for sentence-transformers models (SentenceTransformer, CrossEncoder, and SparseEncoder/SPLADE), plus automation to keep shared docs in sync and to publish updates to the huggingface/skills marketplace.
Changes:
- Introduces three self-contained skills (`train-sentence-transformer`, `train-cross-encoder`, `train-sparse-encoder`) with reference docs and runnable training templates.
- Adds a sync workflow to mirror `skills/<name>/` into `huggingface/skills` on release tags (and via manual dispatch).
- Adds a shared-file mirroring script (`skills/sync_shared.py`) and a pre-commit hook to prevent drift across duplicated docs/scripts.
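The mirroring idea in the last bullet could be sketched roughly as follows. This is not the code of `skills/sync_shared.py`; the function name `sync_shared`, the `CANONICAL` / `MIRRORS` constants, and the choice of which files count as shared are all assumptions made for illustration. The `check` flag stands in for the script's `--check` mode, reporting drift without writing anything.

```python
import filecmp
import shutil
from pathlib import Path

# Hypothetical layout: one canonical skill and the skills that receive copies.
CANONICAL = "train-sentence-transformer"
MIRRORS = ["train-cross-encoder", "train-sparse-encoder"]


def sync_shared(skills_dir: Path, shared_files: list[str], check: bool = False) -> list[Path]:
    """Mirror shared files from the canonical skill into the other skills.

    Returns the mirror paths that were (or, with check=True, would be) updated.
    """
    stale = []
    for rel in shared_files:
        source = skills_dir / CANONICAL / rel
        for name in MIRRORS:
            target = skills_dir / name / rel
            # shallow=False compares file contents, not just size and mtime.
            if target.exists() and filecmp.cmp(source, target, shallow=False):
                continue
            stale.append(target)
            if not check:
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copyfile(source, target)
    return stale
```

A pre-commit hook built on this sketch would call it with `check=True` and fail the commit whenever the returned list is non-empty.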
Reviewed changes
Copilot reviewed 47 out of 48 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| .github/workflows/sync-skills.yml | Syncs skill folders into huggingface/skills and opens an automated PR on releases/manual runs. |
| .gitignore | Ignores local agent/plugin directories used during skill development. |
| .pre-commit-config.yaml | Adds a local hook to enforce shared-doc/script synchronization across skills. |
| skills/README.md | Documents how to install/use the skills and how to develop locally. |
| skills/sync_shared.py | Copies canonical shared docs/scripts from train-sentence-transformer into the other two skills (with --check). |
| skills/train-cross-encoder/SKILL.md | Skill definition and instructions for cross-encoder (reranker) training workflows. |
| skills/train-cross-encoder/scripts/mine_hard_negatives.py | CLI wrapper for mining hard negatives to support training datasets. |
| skills/train-cross-encoder/scripts/train_distillation_example.py | Cross-encoder distillation training template. |
| skills/train-cross-encoder/scripts/train_example.py | Cross-encoder pointwise training template (with hard-negative mining). |
| skills/train-cross-encoder/scripts/train_listwise_example.py | Cross-encoder listwise training template. |
| skills/train-cross-encoder/references/dataset_formats.md | Reference guide for supported dataset shapes/formats. |
| skills/train-cross-encoder/references/evaluators.md | Reference guide for evaluation options/metrics for cross-encoders. |
| skills/train-cross-encoder/references/hardware_guide.md | Hardware guidance for running training efficiently. |
| skills/train-cross-encoder/references/hf_jobs_execution.md | Guidance for running these scripts on Hugging Face Jobs. |
| skills/train-cross-encoder/references/losses.md | Reference guide for cross-encoder losses and when to use them. |
| skills/train-cross-encoder/references/prompts_and_instructions.md | Guidance for prompts/instructions usage during training. |
| skills/train-cross-encoder/references/training_args.md | Reference for training arguments and recommended settings. |
| skills/train-cross-encoder/references/troubleshooting.md | Troubleshooting guide for common training/runtime issues. |
| skills/train-sentence-transformer/SKILL.md | Skill definition and instructions for SentenceTransformer training workflows. |
| skills/train-sentence-transformer/scripts/mine_hard_negatives.py | CLI wrapper for mining hard negatives to support training datasets. |
| skills/train-sentence-transformer/scripts/train_distillation_example.py | SentenceTransformer distillation training template. |
| skills/train-sentence-transformer/scripts/train_example.py | Baseline SentenceTransformer training template. |
| skills/train-sentence-transformer/scripts/train_make_multilingual_example.py | Multilingual training template. |
| skills/train-sentence-transformer/scripts/train_matryoshka_example.py | Matryoshka training template. |
| skills/train-sentence-transformer/scripts/train_multi_dataset_example.py | Multi-dataset training template. |
| skills/train-sentence-transformer/scripts/train_static_embedding_example.py | Static embedding model training template. |
| skills/train-sentence-transformer/scripts/train_with_lora_example.py | LoRA fine-tuning training template. |
| skills/train-sentence-transformer/references/dataset_formats.md | Reference guide for supported dataset shapes/formats. |
| skills/train-sentence-transformer/references/evaluators.md | Reference guide for evaluation options/metrics for bi-encoders. |
| skills/train-sentence-transformer/references/hardware_guide.md | Hardware guidance for running training efficiently. |
| skills/train-sentence-transformer/references/hf_jobs_execution.md | Guidance for running these scripts on Hugging Face Jobs. |
| skills/train-sentence-transformer/references/losses.md | Reference guide for SentenceTransformer losses and when to use them. |
| skills/train-sentence-transformer/references/model_architectures.md | Reference guide for SentenceTransformer model architectures. |
| skills/train-sentence-transformer/references/prompts_and_instructions.md | Guidance for prompts/instructions usage during training. |
| skills/train-sentence-transformer/references/training_args.md | Reference for training arguments and recommended settings. |
| skills/train-sentence-transformer/references/troubleshooting.md | Troubleshooting guide for common training/runtime issues. |
| skills/train-sparse-encoder/SKILL.md | Skill definition and instructions for SPLADE/sparse-encoder training workflows. |
| skills/train-sparse-encoder/scripts/mine_hard_negatives.py | CLI wrapper for mining hard negatives to support training datasets. |
| skills/train-sparse-encoder/scripts/train_distillation_example.py | Sparse-encoder distillation training template. |
| skills/train-sparse-encoder/scripts/train_example.py | Sparse-encoder (SPLADE) contrastive training template. |
| skills/train-sparse-encoder/references/dataset_formats.md | Reference guide for supported dataset shapes/formats. |
| skills/train-sparse-encoder/references/evaluators.md | Reference guide for sparse evaluation options/metrics. |
| skills/train-sparse-encoder/references/hardware_guide.md | Hardware guidance for running training efficiently. |
| skills/train-sparse-encoder/references/hf_jobs_execution.md | Guidance for running these scripts on Hugging Face Jobs. |
| skills/train-sparse-encoder/references/losses.md | Reference guide for sparse-encoder losses and when to use them. |
| skills/train-sparse-encoder/references/prompts_and_instructions.md | Guidance for prompts/instructions usage during training. |
| skills/train-sparse-encoder/references/training_args.md | Reference for training arguments and recommended settings. |
| skills/train-sparse-encoder/references/troubleshooting.md | Troubleshooting guide for common training/runtime issues. |
Hello!
Pull Request overview
- `train-sentence-transformers` Hugging Face Agent Skill under `skills/`, covering all three sentence-transformers architectures (bi-encoder, cross-encoder, SPLADE) so users can drive end-to-end training runs from any compatible coding agent
- `.github/workflows/sync-skills.yml` to mirror the canonical skill to the `huggingface/skills` marketplace on each `v*` tag

Details

The skill follows the Agent Skills format (`SKILL.md` + `references/` + `scripts/`), so it's tool-neutral: once published, users install via `hf skills add train-sentence-transformers`, `/plugin install train-sentence-transformers@huggingface/skills` (Claude Code), or the auto-published Cursor / Codex / Gemini variants. `skills/README.md` documents both paths plus a local-development recipe for contributors who want to symlink the skill folder into their agent's standard install location for instant edit-loop iteration (junctions via `mklink /J` on Windows, since those don't require Developer Mode or admin). `.gitignore` picks up `.claude/` and `.agents/` so those local symlinks stay untracked.

`SKILL.md` is a router rather than a manual: it identifies the model type (`[SentenceTransformer]` / `[CrossEncoder]` / `[SparseEncoder]` via tiebreaker rules) and points at the per-type required reading. Per-type loss / evaluator catalogs (`references/losses_<type>.md`, `references/evaluators_<type>.md`) and production templates (`scripts/train_<type>_example.py`) sit alongside cross-cutting refs (`training_args.md`, `dataset_formats.md`, `troubleshooting.md`, `base_model_selection.md`, plus opt-in `model_architectures.md`, `hardware_guide.md`, `hf_jobs_execution.md`, `prompts_and_instructions.md`). Variant templates cover Matryoshka, multi-dataset, LoRA, distillation, multilingual, static embedding, listwise CE, and SPLADE distillation. `scripts/mine_hard_negatives.py` ships as a CLI for the cross-cutting hard-negative mining step.

`.github/workflows/sync-skills.yml` is modelled on huggingface_hub's sync-hf-cli-skill.yml. It fires on `v*` tags (excluding RCs) and on manual `workflow_dispatch`, checks out `huggingface/skills` using the same GitHub App credentials as the hub-cli workflow (reachable at the `huggingface` org level since the repo moved here), copies `skills/train-sentence-transformers/` into the receiving repo, runs that repo's `./scripts/publish.sh` to regenerate the cross-tool manifests, and opens a PR. `marketplace.json` entries are hand-maintained on the receiving end, so first publication needs a one-time manual PR adding the folder and its entry; the workflow takes over for subsequent content updates. `workflow_dispatch` is the manual escape hatch for skill-only fixes between releases.

The same PR drops `--diff` from the typos pre-commit hook. In `--diff` mode, typos silently exits non-zero with no output when a typo has multiple suggested corrections (e.g. ambiguous prefixes), which made failures debug-hostile. The default error format gives `file:line:col` + the suggestion inline, which pre-commit displays correctly.
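
As a rough illustration of the sync trigger rule described above (`v*` tags, excluding RCs), the decision could be modelled as below. The real workflow presumably expresses this in its trigger configuration rather than in Python; the function name `should_sync` and the assumption that release candidates are marked by `rc` somewhere in the tag name are hypothetical.

```python
def should_sync(ref: str) -> bool:
    """Return True when a Git ref looks like a final release tag.

    Assumptions (not taken from the workflow file): release tags start
    with "v", and release candidates contain "rc" in the tag name,
    e.g. "v5.1.0rc1" or "v5.1.0-rc.1".
    """
    prefix = "refs/tags/"
    if not ref.startswith(prefix):
        return False  # branches and other refs never trigger the sync
    tag = ref[len(prefix):]
    return tag.startswith("v") and "rc" not in tag.lower()


for ref in ("refs/tags/v5.1.0", "refs/tags/v5.1.0rc1", "refs/heads/main"):
    print(ref, should_sync(ref))
```

Manual `workflow_dispatch` runs sit outside this rule, which is what makes them usable for skill-only fixes between releases.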