Skip to content

Remove unsupported and confusing "OpenLlama" architecture#24913

Closed
tomaarsen wants to merge 1 commit intohuggingface:mainfrom
tomaarsen:remove_unsupported_open_llama
Closed

Remove unsupported and confusing "OpenLlama" architecture#24913
tomaarsen wants to merge 1 commit intohuggingface:mainfrom
tomaarsen:remove_unsupported_open_llama

Conversation

@tomaarsen
Copy link
Member

This reverts commit c2c99dc from #22795.

Hello!

What does this PR do?

Removes Open-Llama, a confusingly named and unused model architecture from which all documentation links throw 404 errors.

Motivation

  • The Open-Llama-V1 model and Open-Llama GitHub repository have both been removed.
  • There is only one issue on the transformers repo that refers to OpenLlamaForCausalLM.
  • There is only one model on the Hub that mentions OpenLlamaForCausalLM.
  • All of the ~1900 other LLama models use the standard LlamaForCausalLM instead.
  • Users on the HF Discord have been confused by the Open-Llama documentation page.

In case you are opposed out of principle, i.e. that no architectures should be removed, then we may want to at least update the documentation. However, I urge you to remove this confusing architecture.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Reviewers of the original PR: @ArthurZucker @sgugger
cc: @s-JoL

  • Tom Aarsen

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@sgugger
Copy link
Collaborator

sgugger commented Jul 19, 2023

No we cannot removing the architecture entirely that would be a breaking change in the library that would be unprecedented and against our philosphy/commitment to backward compatibility.

Even removing it entirely from the docs seem a bit extreme. If users are confused by which class to use to load a model, they should use the Auto model API to let it select the right class for them. We can add a disclaimer in the doc page of open-llama doc or move it to the deprecated models but that's pretty much the extent of what we can do.

cc @LysandreJik

@tomaarsen
Copy link
Member Author

I had such a fear - I understand your reasoning completely. Whichever way we go, we end up with a suboptimal situation. I'll let the user on Discord know that it won't be changed.

I could add a disclaimer that the OpenLlama models do not use the OpenLlama architecture, but simply the Llama one.

@tomaarsen tomaarsen closed this Jul 19, 2023
@tomaarsen tomaarsen deleted the remove_unsupported_open_llama branch July 19, 2023 11:57
@sgugger
Copy link
Collaborator

sgugger commented Jul 19, 2023

I could add a disclaimer that the OpenLlama models do not use the OpenLlama architecture, but simply the Llama one.

Yes, that would be great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants