
Ume embedding normalization + small fixes#115

Merged
karinazad merged 22 commits into main from k/ume-optimizer
Jun 23, 2025
Conversation


@karinazad karinazad commented Jun 19, 2025

PR #115: UME Embedding Normalization + Small Fixes

Summary

This PR includes several minor fixes and improvements. The main change is the addition of L2 normalization to embeddings in the contrastive learning pipeline, specifically in the Symile loss computation.

Key Changes

1. Embedding Normalization

  • Location: src/lobster/model/_ume.py line 651
  • Change: Added L2 normalization to embeddings in the _compute_symile_loss method:
    embeddings = [torch.nn.functional.normalize(embedding, dim=-1) for embedding in embeddings]
  • Purpose: Ensures embeddings have unit norm before computing contrastive loss, which is a standard practice in contrastive learning to improve training stability and performance
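
    The change above can be sketched as a small standalone helper (the function name and batch shapes here are illustrative; only the `torch.nn.functional.normalize` call mirrors the actual change):

    ```python
    # Minimal sketch of the normalization added before the contrastive loss;
    # normalize_embeddings is an illustrative name, not the method in _ume.py.
    import torch

    def normalize_embeddings(embeddings: list[torch.Tensor]) -> list[torch.Tensor]:
        """L2-normalize each modality's embeddings along the feature dimension."""
        return [torch.nn.functional.normalize(e, dim=-1) for e in embeddings]

    # e.g. three modalities, batch of 4, 128-dim embeddings
    batch = [torch.randn(4, 128) for _ in range(3)]
    normed = normalize_embeddings(batch)
    # Each row now has unit norm, so dot products equal cosine similarities.
    ```

    With unit-norm embeddings, the similarity scores fed to the loss are bounded in [-1, 1], which is what makes a fixed or learned scale behave consistently across batches.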

2. Rename Ume to UME

  • Scope: Multiple files across the codebase

3. Dependency Updates

  • File: pyproject.toml
  • Changes:
    • Updated flash-attn from >=2.7.4.post1 to >=2.8.0.post2
    • Updated PyTorch constraint from ==2.6.0 to ==2.7.0 for flash extra
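
    For reference, the updated constraints would appear in pyproject.toml roughly as follows (the surrounding table and extra name are assumptions based on the description above, not copied from the file):

    ```toml
    # Illustrative fragment of pyproject.toml; exact structure is assumed.
    [project.optional-dependencies]
    flash = [
        "flash-attn>=2.8.0.post2",
        "torch==2.7.0",
    ]
    ```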

@karinazad karinazad changed the title Ume small fixes Ume embedding normalization + small fixes Jun 19, 2025
@ncfrey ncfrey requested a review from Copilot June 23, 2025 18:20

Copilot AI left a comment


Pull Request Overview

This PR applies three main changes: it adds L2 normalization to embeddings before computing the Symile loss, renames Ume to UME across the codebase, and updates several dependencies.

  • Embedding normalization is added in the Symile loss computation to ensure unit norm, improving contrastive learning stability.
  • Renaming of Ume to UME is applied uniformly in tests, model, tokenizers, hydra configs, and example scripts.
  • Dependency updates adjust the versions for flash-attn and PyTorch to the latest supported releases.

Reviewed Changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/**/*.py Updated UME naming in tokenizer and model usage tests.
src/lobster/model/_ume.py Added normalization to embeddings and renamed Ume to UME.
src/lobster/model/losses/_symile_loss.py Updated initialization of SymileLoss to use a logit scale instead of temperature.
src/lobster/server/_server.py Renamed UmeServer to UMEServer and corresponding type hint updates.
src/lobster/hydra_config/**/*.yaml Updated configuration targets and dependency settings.
notebooks/**/*.ipynb & examples/**/*.py Renamed occurrences of Ume to UME for consistency.
pyproject.toml Updated flash-attn and PyTorch version constraints.
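
The SymileLoss change noted above (a logit scale instead of a temperature) typically follows the CLIP-style parameterization. A hedged sketch, with illustrative class and parameter names that are not necessarily what SymileLoss uses:

```python
# Sketch of a learnable logit scale replacing a fixed temperature.
# The init value log(1/0.07) is the common CLIP choice; names are illustrative.
import math
import torch

class ContrastiveHead(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.logit_scale = torch.nn.Parameter(torch.tensor(math.log(1 / 0.07)))

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        a = torch.nn.functional.normalize(a, dim=-1)
        b = torch.nn.functional.normalize(b, dim=-1)
        # exp() keeps the scale positive; equivalent to dividing by a temperature
        return self.logit_scale.exp() * a @ b.T

head = ContrastiveHead()
logits = head(torch.randn(4, 16), torch.randn(4, 16))  # shape (4, 4)
```

Parameterizing the scale in log space keeps it positive without clamping and lets the model learn an effective temperature during training.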

@karinazad karinazad merged commit e6f3f9c into main Jun 23, 2025
5 checks passed
@karinazad karinazad deleted the k/ume-optimizer branch June 23, 2025 18:44
ncfrey pushed a commit that referenced this pull request Jun 24, 2025
* add optimizer

* scheduler args

* normalize embeddings before contrastive losses

* tests

* ume move embeddings funcs out

* simplify contrastive step

* streaming dataset remove transform

* fix symile loss, bump torch to 2.7

* fix infonce loss

* remove logit scale parameter

* remove symile to device placement

* remove symile to device placement

* batch size finder callback

* debug

* pplx logging by modality

* tests

* replace Ume with UME :'(

* replace Ume with UME :'(

* remove examples

* remove debug notebook
