Ume embedding normalization + small fixes #115
Merged
Conversation
Contributor
Pull Request Overview
This PR applies three main changes: it adds L2 normalization to embeddings before computing the Symile loss, renames Ume to UME across the codebase, and updates several dependencies.
- Embedding normalization is added in the Symile loss computation to ensure unit norm, improving contrastive learning stability.
- Renaming of Ume to UME is applied uniformly in tests, model, tokenizers, hydra configs, and example scripts.
- Dependency updates adjust the versions for flash-attn and PyTorch to the latest supported releases.
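The normalization change in the first bullet can be sketched as follows (a minimal illustration, not the actual lobster code): L2-normalizing embeddings makes every dot product a cosine similarity, which bounds the logits fed into a contrastive loss.

```python
import torch
import torch.nn.functional as F

# Stand-in embeddings of shape (batch, dim).
embeddings = torch.randn(4, 8)

# L2 normalization: each row is scaled to unit norm, so dot products
# between rows become cosine similarities in [-1, 1].
normalized = F.normalize(embeddings, p=2, dim=-1)

# Every row now has (approximately) unit L2 norm.
row_norms = normalized.norm(dim=-1)
```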
Reviewed Changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/**/*.py | Updated UME naming in tokenizer and model usage tests. |
| src/lobster/model/_ume.py | Added normalization to embeddings and renamed Ume to UME. |
| src/lobster/model/losses/_symile_loss.py | Updated initialization of SymileLoss to use a logit scale instead of temperature. |
| src/lobster/server/_server.py | Renamed UmeServer to UMEServer and corresponding type hint updates. |
| src/lobster/hydra_config/**/*.yaml | Updated configuration targets and dependency settings. |
| notebooks/**/*.ipynb & examples/**/*.py | Renamed occurrences of Ume to UME for consistency. |
| pyproject.toml | Updated flash-attn and PyTorch version constraints. |
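One row above notes that SymileLoss is now initialized with a logit scale instead of a temperature. A hedged sketch of why the two are interchangeable (variable names here are illustrative, not the actual lobster API): dividing similarities by a temperature t is the same as multiplying by a scale s = 1/t, and CLIP-style losses often learn log(s) so the scale stays positive under gradient updates.

```python
import math
import torch

temperature = 0.07
logit_scale = 1.0 / temperature              # equivalent multiplicative form

sim = torch.randn(4, 4)                      # stand-in similarity matrix
logits_from_temperature = sim / temperature
logits_from_scale = sim * logit_scale        # same logits, same loss

# Learnable variant: store the scale in log space and exponentiate,
# so it remains strictly positive during training.
log_logit_scale = torch.nn.Parameter(torch.tensor(math.log(1.0 / 0.07)))
scale = log_logit_scale.exp()
```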
ncfrey approved these changes on Jun 23, 2025
…to k/ume-optimizer
ncfrey pushed a commit that referenced this pull request on Jun 24, 2025:
* add optimizer
* scheduler args
* normalize embeddings before contrastive losses
* tests
* ume move embeddings funcs out
* simplify contrastive step
* streaming dataset remove transform
* fix symile loss, bump torch to 2.7
* fix infonce loss
* remove logit scale parameter
* remove symile to device placement
* remove symile to device placement
* batch size finder callback
* debug
* pplx logging by modality
* tests
* replace Ume with UME :'(
* replace Ume with UME :'(
* remove examples
* remove debug notebook
PR #115: UME Embedding Normalization + Small Fixes
Summary
This PR includes several minor fixes and improvements. The main change is the addition of L2 normalization to embeddings in the contrastive learning pipeline, specifically in the Symile loss computation.
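A minimal sketch of what the main change amounts to (the helper and argument names here are hypothetical, not the actual lobster implementation): each modality's embeddings are L2-normalized to unit norm before being handed to the Symile loss.

```python
import torch
import torch.nn.functional as F

def compute_symile_loss(embeddings_per_modality, symile_loss_fn):
    """Normalize each modality's embeddings, then apply the contrastive loss.

    Hypothetical helper illustrating the PR's change, not lobster's API.
    """
    normalized = [F.normalize(e, p=2, dim=-1) for e in embeddings_per_modality]
    return symile_loss_fn(normalized)

# Demo with a stand-in loss: after normalization every row has unit norm,
# so the sum of squared norms equals (num_modalities * batch_size).
embs = [torch.randn(2, 8) for _ in range(3)]          # three modalities
loss = compute_symile_loss(embs, lambda xs: sum(x.pow(2).sum() for x in xs))
# -> 6.0 (3 modalities x 2 unit-norm rows)
```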
Key Changes
1. Embedding Normalization
In src/lobster/model/_ume.py (line 651), the _compute_symile_loss method now applies L2 normalization to embeddings before computing the loss.
2. Rename Ume to UME
3. Dependency Updates
In pyproject.toml: flash-attn bumped from >=2.7.4.post1 to >=2.8.0.post2, and PyTorch from ==2.6.0 to ==2.7.0 for the flash extra.
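The version bumps above might look roughly like this in pyproject.toml (the section layout is an assumption; only the version strings come from the PR):

```toml
[project.optional-dependencies]
flash = [
    "torch==2.7.0",
    "flash-attn>=2.8.0.post2",
]
```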