Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR fixes an invalid default for diff_channel_1 and ensures embeddings are moved to the correct device before training, preventing tensor placement errors.
- Changed
diff_channel_1default from"sub"to"diff"for consistency with_resize_embeddings - Added
.to(self.device)when concatenating embeddings to avoid CPU/CUDA mismatch
Comments suppressed due to low confidence (2)
src/lobster/model/_dyab.py:222
- Transferring each tensor individually inside the list comprehension can incur multiple device-to-device copies. Consider concatenating on CPU first and then moving the final tensor to the target device with a single
.to(self.device)call.
embeddings1 = torch.concat([this.embedding_cache[seq].to(self.device) for seq in sequences1], dim=0)
src/lobster/model/_dyab.py:223
- Similar to embeddings1, performing
.to(self.device)on each tensor can be inefficient. It’s better to concat on CPU then call.to(self.device)once on the result.
embeddings2 = torch.concat([this.embedding_cache[seq].to(self.device) for seq in sequences2], dim=0)
ncfrey
approved these changes
Jun 13, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Some small modifications to the DyAb class
_resize_embeddingsto(self.device)to embeddings, so that the model can be trained on a CUDA device. In my testing on pcluster, without this, there exists tensors on both CPU and CUDA, which cuases an exception to be raised.