Update transformers and sentence transformers docs #18925
BenWilson2 merged 6 commits into mlflow:master
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Pull Request Overview
This pull request standardizes and cleans up the Transformers and Sentence Transformers documentation by removing marketing language, consolidating pages, and providing more focused, practical examples.
Key Changes
- Simplified navigation structure by consolidating Sentence Transformers guides and tutorials into a single page
- Rewrote documentation to be more concise and focused on practical usage
- Removed version-specific notes that are no longer relevant
- Updated redirects to maintain backward compatibility with old URLs
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/sidebarsClassicML.ts | Simplified Sentence Transformers navigation from category to single doc |
| docs/docusaurus.config.ts | Updated redirects for consolidated Sentence Transformers pages |
| docs/docs/classic-ml/deep-learning/transformers/index.mdx | Cleaned up marketing language, added structured sections with clearer examples |
| docs/docs/classic-ml/deep-learning/transformers/guide/index.mdx | Removed outdated version-specific notes |
| docs/docs/classic-ml/deep-learning/sentence-transformers/index.mdx | Complete rewrite with practical examples and consolidated content |
| docs/docs/classic-ml/deep-learning/sentence-transformers/guide/index.mdx | Removed (content consolidated into main page) |
| docs/docs/classic-ml/deep-learning/sentence-transformers/tutorials/index.mdx | Removed (content consolidated into main page) |
| - 🔍 Build advanced semantic search systems with re-ranking and relevance optimization | ||
| - 📦 Create custom model flavors for specialized architectures and deployment requirements | ||
| - 🎨 Develop comprehensive evaluation frameworks for semantic quality assessment | ||
| ```python |
Missing import statements. The code block uses `mlflow` and `SentenceTransformer` but doesn't import them. For a standalone example, it should include `import mlflow` and `from sentence_transformers import SentenceTransformer` before the `import time` statement.
| ```python | |
| ```python | |
| import mlflow | |
| from sentence_transformers import SentenceTransformer |
| ```python | ||
| # Fine-tune for your domain | ||
| from sentence_transformers import SentenceTransformer, InputExample, losses | ||
| from sentence_transformers import InputExample, losses |
Missing import statements. The code block uses `mlflow` and `SentenceTransformer` but doesn't import them. For a standalone example, it should include `import mlflow` and `from sentence_transformers import SentenceTransformer`.
| from sentence_transformers import InputExample, losses | |
| import mlflow | |
| from sentence_transformers import SentenceTransformer, InputExample, losses |
| ```python | ||
| models_to_compare = [ | ||
| "all-MiniLM-L6-v2", | ||
| "all-mpnet-base-v2", | ||
| "paraphrase-albert-small-v2", | ||
| ] |
Missing import statements. The code block uses `mlflow`, `SentenceTransformer`, `pd` (pandas), and `time` but doesn't import them. For a standalone example, it should include all necessary imports: `import mlflow`, `import pandas as pd`, `import time`, and `from sentence_transformers import SentenceTransformer`.
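The review comments above all flag the same class of problem: a docs snippet that uses a name it never imports. One rough way to catch this before review is a static check over the snippet's syntax tree. The sketch below is an illustration only, not part of this PR or of MLflow's tooling; `missing_names` and `SNIPPET` are hypothetical, and the check ignores scoping, so it is a first-pass filter rather than a full linter.

```python
import ast
import builtins


def missing_names(snippet: str) -> set[str]:
    """Return names a snippet reads but never imports, assigns, or defines."""
    tree = ast.parse(snippet)
    defined = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as b" binds "b".
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                defined.add(node.id)
    return used - defined


# Hypothetical docs snippet in the style of the ones flagged in review.
SNIPPET = '''
import time

model = SentenceTransformer("all-MiniLM-L6-v2")
start = time.time()
with mlflow.start_run():
    mlflow.log_metric("load_seconds", time.time() - start)
'''

print(sorted(missing_names(SNIPPET)))  # prints ['SentenceTransformer', 'mlflow']
```

Prepending the two imports suggested in the review makes the same check return an empty set.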
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Pull request overview
Copilot reviewed 7 out of 9 changed files in this pull request and generated no new comments.
| icon: Zap, | ||
| title: "Autologging", | ||
| description: "Automatic tracking of training metrics and parameters", | ||
| }, |
I don't think we support autologging?
Perhaps the trainer logger for transformers works, but in that case can we add a guide for it on that page?
| - 🧬 **Scientific Literature Analysis**: Analyze research papers, patents, and technical documents for similarity, trends, and knowledge gaps | ||
|
|
||
| ## Complete Learning Journey | ||
| ## Batch Processing |
Do we need this example? It doesn't seem to use any MLflow-specific feature.
| ``` | ||
|
|
||
| </details> | ||
| ## Model Comparison |
If we want to show a comparison use case, I guess we should use eval? (Or maybe we can just remove it.)
Yeah I'm just going to remove it.
| @@ -202,10 +202,6 @@ print(response) | |||
|
|
|||
| ## Saving Prompt Templates with Transformer Pipelines | |||
Can we actually remove this guide? We should now recommend the prompt registry rather than logging the prompt template directly in the model.
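As a rough illustration of the alternative suggested here, a prompt can be versioned in the MLflow Prompt Registry and loaded at inference time instead of being saved with the pipeline. The sketch below assumes MLflow 3.x, where the registry functions live under `mlflow.genai`, and a tracking backend that supports the registry. The prompt name, template text, and both helper functions are hypothetical and do not come from this PR.

```python
# Hypothetical prompt name and template; {{question}} is a template variable.
PROMPT_NAME = "qa-prompt"
PROMPT_TEMPLATE = "Answer the following question concisely: {{question}}"


def register_qa_prompt():
    """Register a new version of the prompt in the Prompt Registry."""
    import mlflow  # lazy import; assumes MLflow 3.x is installed

    return mlflow.genai.register_prompt(
        name=PROMPT_NAME,
        template=PROMPT_TEMPLATE,
        commit_message="Initial version",
    )


def render_qa_prompt(question: str, version: int = 1) -> str:
    """Load a registered prompt version and fill in its variable."""
    import mlflow

    prompt = mlflow.genai.load_prompt(f"prompts:/{PROMPT_NAME}/{version}")
    return prompt.format(question=question)
```

The rendered string can then be passed to a logged Transformers pipeline as ordinary input, which keeps prompt changes independent of model versions.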
| <CardGroup> | ||
| <PageCard | ||
| headerText="Sentence Transformers Quickstart" | ||
| link="/ml/deep-learning/sentence-transformers/tutorials/quickstart/sentence-transformers-quickstart/" |
Let's add these tutorial pages to the sidebar.
| tracking to model deployment. This combination offers a robust and efficient pathway for incorporating advanced NLP and AI capabilities | ||
| into your applications. | ||
| ```bash | ||
| pip install mlflow[transformers] |
| pip install mlflow[transformers] | |
| pip install mlflow transformers |
Thoughts on adding a quick section about `report_to="mlflow"` in the transformers library as an autologging feature?
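For context, `report_to="mlflow"` is an option on `transformers.TrainingArguments`; with it set, the Hugging Face `Trainer` attaches its MLflow callback and logs training parameters and metrics to the configured tracking server. A minimal sketch, assuming `transformers` and `mlflow` are installed. The output directory, experiment name, and step count are placeholder values, and `build_training_args` is a hypothetical helper rather than anything from this PR.

```python
import os

# Hypothetical experiment name; the Trainer's MLflow callback reads this variable.
os.environ.setdefault("MLFLOW_EXPERIMENT_NAME", "transformers-finetune")

# Placeholder values for illustration only.
TRAINING_ARGS = {
    "output_dir": "./results",
    "report_to": "mlflow",  # route Trainer logs to MLflow
    "logging_steps": 50,
}


def build_training_args():
    """Build TrainingArguments that report to MLflow (hypothetical helper)."""
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_ARGS)
```

The returned object would be passed as `args=` when constructing a `Trainer`; logging then happens during `trainer.train()` without explicit `mlflow.log_*` calls.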
B-Step62 left a comment
Overall LGTM but left a few comments to be addressed before merging
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
Standardize and clean up junk marketing fluff from the transformers / sentence transformers docs.