Ume weighted contrastive loss#79

Merged
karinazad merged 3 commits into main from ume-weighted-contrastive-loss
May 19, 2025

Conversation


@karinazad karinazad commented May 16, 2025

Add an option to run InfoNCE and MLM at the same time.

  • if there are two items in the batch, we run MLM on the first item and InfoNCE on the pair
  • the first item is the original modality, and the second item is the converted modality
  • we run MLM only on the first item, since that's the more "normal" sample we'd see in standard training

(1 - self.contrastive_loss_weight) * mlm_loss + self.contrastive_loss_weight * contrastive_loss
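A minimal sketch of the combined objective in PyTorch. The `info_nce_loss` helper and all names here are hypothetical illustrations of the weighting described above, not the actual Ume implementation.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of paired embeddings (hypothetical helper).

    emb_a / emb_b: (B, D) embeddings of the original and converted modality.
    Positives sit on the diagonal of the (B, B) similarity matrix.
    """
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    logits = emb_a @ emb_b.T / temperature
    targets = torch.arange(logits.size(0))
    # Average both directions (a -> b and b -> a), as in CLIP-style training.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.T, targets)) / 2

def combined_loss(mlm_loss: torch.Tensor, contrastive_loss: torch.Tensor,
                  contrastive_loss_weight: float = 0.5) -> torch.Tensor:
    """Weighted sum from the PR description: MLM on the first item,
    InfoNCE on the (original, converted) pair."""
    return ((1 - contrastive_loss_weight) * mlm_loss
            + contrastive_loss_weight * contrastive_loss)
```

With `contrastive_loss_weight = 0` this reduces to plain MLM training, and with `1` to pure contrastive training, so the weight smoothly interpolates between the two objectives.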

@karinazad karinazad requested review from ncfrey and sjmielke May 16, 2025 18:42

@sjmielke sjmielke left a comment


Small suggestions, though I'd also be curious whether coupling the MLM and InfoNCE losses like this will work out well given the different batch-size requirements of each: CLIP uses something like a 32k batch size for the contrastive loss, so I don't know if that's also the range we're thinking of for these combined batches. And of course, all of this assumes we can fit that much in memory without tiling the similarity-matrix computation... maybe it's fine for now just to see if it kinda works :)

@karinazad karinazad merged commit 26613c7 into main May 19, 2025
5 checks passed
@karinazad karinazad deleted the ume-weighted-contrastive-loss branch May 19, 2025 14:52
