Merged
Pull Request Overview
This pull request fixes configuration parameters for the UME_large model in the ModernBERT module to align with the intended architecture specifications.
Key changes:
- Corrected the attention head count so that the hidden size is evenly divisible by it
- Adjusted intermediate and hidden sizes to match proper model dimensions
```diff
-    "num_attention_heads": 25,
-    "intermediate_size": 6400,
-    "hidden_size": 1600,
+    "num_attention_heads": 24,
+    "intermediate_size": 6912,
+    "hidden_size": 1728,
```
The change from 25 to 24 attention heads ensures the hidden_size (1728) is evenly divisible by num_attention_heads (24), which is required for multi-head attention to work correctly. This fixes a potential runtime error where head_dim would not be an integer.
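The divisibility requirement described above can be sketched with a small check (a minimal illustration; `check_head_dim` is a hypothetical helper, not part of the ModernBERT code):

```python
def check_head_dim(hidden_size: int, num_attention_heads: int) -> int:
    """Return the per-head dimension, failing fast when the split is uneven."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be divisible by "
            f"num_attention_heads ({num_attention_heads})"
        )
    return hidden_size // num_attention_heads

# Updated UME_large values: each of the 24 heads gets a 72-dim slice.
print(check_head_dim(1728, 24))  # 72

# The new hidden_size with the old head count would fail:
# check_head_dim(1728, 25) raises ValueError, since 1728 / 25 = 69.12
```

This is the kind of integer head_dim that multi-head attention implementations assume when they reshape the hidden states into per-head slices.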
taylormjs approved these changes on Aug 1, 2025
Description
This pull request includes a configuration update for the `UME_large` model in the `ModernBERT` module. The changes adjust several parameters to better align with the intended architecture.

Configuration updates for the `UME_large` model:
- `src/lobster/model/modern_bert/_modern_bert_configuration.py`: Updated `UME_large` model parameters:
  - `num_attention_heads` from 25 to 24
  - `intermediate_size` from 6400 to 6912
  - `hidden_size` from 1600 to 1728

Type of Change
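The three updated values fit together: a quick sanity check (a hypothetical sketch; the dict name and layout are illustrative, not the actual file contents) shows that the hidden size now splits evenly across heads and that the update preserves the 4x ratio between `intermediate_size` and `hidden_size` that the old values (6400 = 4 x 1600) also had:

```python
# Hypothetical mirror of the updated UME_large parameters (illustrative only).
ume_large = {
    "num_attention_heads": 24,
    "intermediate_size": 6912,
    "hidden_size": 1728,
}

# hidden_size must split evenly across attention heads (1728 / 24 = 72).
assert ume_large["hidden_size"] % ume_large["num_attention_heads"] == 0

# The FFN expansion factor stays at 4x (6912 = 4 * 1728), matching the
# ratio of the old values (6400 = 4 * 1600).
assert ume_large["intermediate_size"] == 4 * ume_large["hidden_size"]
```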