Internalise the NomicBERT model #43067

Merged: vasqu merged 108 commits into huggingface:main from ed22699:bert-rope-model on Apr 2, 2026

ed22699 (Contributor) commented Dec 29, 2025

What does this PR do?

This PR internalises the NomicBERT model, following the basic structure of https://huggingface.co/nomic-ai/nomic-bert-2048.

Fixes #42738

Problem

BERT-like models that use RoPE (rotary position embeddings), e.g. https://huggingface.co/nomic-ai/nomic-bert-2048, are currently not internalized in our codebase.
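
For readers unfamiliar with RoPE, here is a minimal pure-Python sketch of the idea. It is not the code added in this PR: the function names are hypothetical, the base of 10000 is assumed as the commonly used default, and the interleaved pairing of dimensions is just one of the conventions found in practice. It only illustrates that rotating queries and keys by position-dependent angles makes their dot product depend on the relative offset between positions.

```python
import math


def rope_angles(position, head_dim, base=10000.0):
    """Return one rotation angle per pair of dimensions for a token position."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # Assumed frequency schedule: theta_i = position / base^(2i / head_dim).
    return [position / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]


def apply_rope(vector, position, base=10000.0):
    """Rotate consecutive pairs (x[2i], x[2i+1]) by position-dependent angles."""
    rotated = []
    for i, theta in enumerate(rope_angles(position, len(vector), base)):
        x, y = vector[2 * i], vector[2 * i + 1]
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        rotated.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return rotated


def dot(a, b):
    """Plain dot product, standing in for an unscaled attention score."""
    return sum(x * y for x, y in zip(a, b))


# Illustrative toy vectors (hypothetical values, head_dim = 4).
query = [0.5, -1.0, 2.0, 0.25]
key = [1.5, 0.5, -0.75, 1.0]

# Both pairs of positions have the same relative offset (3),
# so the two scores agree up to floating-point error.
score_a = dot(apply_rope(query, 5), apply_rope(key, 2))
score_b = dot(apply_rope(query, 105), apply_rope(key, 102))
print(abs(score_a - score_b) < 1e-9)
```

Because each pair of dimensions is only rotated, the vector norm is unchanged, and no learned absolute position table is needed. That is the property that lets a BERT-style encoder be trained at longer context lengths such as 2048 tokens.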

Solution

This PR creates a basic internalized version of nomic-bert-2048 with the required modifications.

  • Modular file: modular_nomic_bert.py implemented and verified with python utils/modular_model_converter.py modular_nomic_bert.py
  • Conversion script: convert_nomic_bert_to_hf.py added with usage examples
  • Integration tests: End-to-end tests with exact output matching (text or logits)
  • Documentation: Model docs added/updated in docs/source/en/model_doc/
  • Pattern reuse: Verified against similar models (LLaVA, Idefics2, etc.)
  • Quality checks: make fixup passes with no errors

Who Can Review?

@ArthurZucker @Cyrilvallez (text models)

ed22699 and others added 12 commits December 27, 2025 18:06
Co-authored-by: Felix Arkle <felixarkle@icloud.com>
Implemented descriptions for the main nomic bert documentation and debugged modular_nomic_bert
Co-authored-by: Felix Arkle <felixarkle@icloud.com>
Add einops to setup and add availability checks for a more graceful exit if not available
Previous version overwrote bert, leading to forward_unimplemented
Remove code which broke the encoder-only assumption

github-actions (bot, Contributor) commented Apr 1, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, jina_embeddings_v3, nomic_bert

vasqu (Contributor) commented Apr 1, 2026

run-slow: jina_embeddings_v3, nomic_bert

github-actions (bot, Contributor) commented Apr 1, 2026

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/jina_embeddings_v3", "models/nomic_bert"]
quantizations: []

github-actions (bot, Contributor) commented Apr 1, 2026

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43067&sha=da43bf

github-actions (bot, Contributor) commented Apr 1, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context  Commit    Description
RUN      ed2325fb  workflow commit (merge commit)
PR       da43bf34  branch commit (from PR)
main     f38d6639  base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

tomaarsen (Member) left a comment

My understanding is that this incorporates only the non-MoE path? The https://huggingface.co/nomic-ai/nomic-bert-2048/blob/main/modeling_hf_nomic_bert.py modeling code is used for various models, including:

These vision models:

And these research checkpoints:

I assume that this work is only aiming for the text portion. That does mean that we're diverging a bit from the original implementation, which also supports vision and MoE. Not strictly an issue, just something to note.
If we move forward, let's try to support not just the v1.5 but also the v1, since it's also getting used a lot.

Comment thread docs/source/en/model_doc/nomic_bert.md Outdated

## Overview

The NomicBERT model currently has no academic papers specifically written about it, however, the [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) card clearly describes the model’s architecture and training approach: it extends BERT to a 2048 token context length, and modifies the BERT training procedure. Notable changes include:

Member

Contributor

Updated the docs 🫡

github-actions (bot, Contributor) commented Apr 2, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, jina_embeddings_v3, nomic_bert

vasqu (Contributor) commented Apr 2, 2026

run-slow: jina_embeddings_v3, nomic_bert

github-actions (bot, Contributor) commented Apr 2, 2026

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/jina_embeddings_v3", "models/nomic_bert"]
quantizations: []

github-actions (bot, Contributor) commented Apr 2, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context  Commit    Description
RUN      59f0c24d  workflow commit (merge commit)
PR       0b61950f  branch commit (from PR)
main     abc417a4  base commit (on main)

⚠️ Model CI failed to report results

The test failure analysis could not be completed. Please check the workflow run for details.

vasqu (Contributor) commented Apr 2, 2026

run-slow: jina_embeddings_v3, nomic_bert

github-actions (bot, Contributor) commented Apr 2, 2026

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/jina_embeddings_v3", "models/nomic_bert"]
quantizations: []

github-actions (bot, Contributor) commented Apr 2, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context  Commit    Description
RUN      02e17259  workflow commit (merge commit)
PR       c27a3aa7  branch commit (from PR)
main     abc417a4  base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

tomaarsen (Member) left a comment

Small nits, the general gist is solid I think.

Comment thread docs/source/en/model_doc/nomic_bert.md Outdated
Comment thread docs/source/en/model_doc/nomic_bert.md Outdated
Comment thread docs/source/en/model_doc/nomic_bert.md Outdated

vasqu (Contributor) commented Apr 2, 2026

hub has problems and the other test is unrelated, merging

github-actions (bot, Contributor) commented Apr 2, 2026

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43067&sha=67fbce

vasqu (Contributor) commented Apr 2, 2026

Thanks a lot to everyone involved @ed22699 @tomaarsen 🤗

Development

Successfully merging this pull request may close these issues.

BERT-like models with RoPE

6 participants