Closed
Description
Environment info
- transformers version: 4.17.0.dev0
- Platform: Linux-5.13.0-27-generic-x86_64-with-glibc2.34
- Python version: 3.9.7
- PyTorch version (GPU?): 1.10.2+cu113 (True)
- Tensorflow version (GPU?): 2.7.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: N/A
- Using distributed or parallel set-up in script?: N/A
Who can help
Information
Model I am using: Wav2Vec2 with KenLM
The problem arises when using:
- the official example scripts: any script that uses the ASR pipeline to load a Wav2Vec2 model with an attached language model from a local directory, for example eval.py
- my own modified scripts
The task I am working on is:
- an official GLUE/SQUaD task: robust-speech-event
- my own task or dataset
To reproduce
Steps to reproduce the behavior:
- Download the eval.py script
- Clone a model repo that contains a language model
- Run the script with the model in a local directory
- It tries to download the model from the hub even though it should load locally
$ git clone https://huggingface.co/NbAiLab/wav2vec2-xls-r-1b-npsc-bokmaal-low-27k
$ cd wav2vec2-xls-r-1b-npsc-bokmaal-low-27k
$ python eval.py --model_id ./ --dataset NbAiLab/NPSC --config 16K_mp3_bokmaal --split test --log_outputs
Reusing dataset npsc (/home/user/.cache/huggingface/datasets/NbAiLab___npsc/16K_mp3_bokmaal/1.0.0/fab8b0517ebc9c0c6f0d019094e8816d5537f55d965f2dd90750349017b0bc69)
Traceback (most recent call last):
  File "/home/user/wav2vec2-xls-r-1b-npsc-bokmaal-low-27k/eval.py", line 151, in <module>
    main(args)
  File "/home/user/wav2vec2-xls-r-1b-npsc-bokmaal-low-27k/eval.py", line 98, in main
    asr = pipeline("automatic-speech-recognition", model=args.model_id, device=args.device)
  File "/home/user/audio/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 628, in pipeline
    decoder = BeamSearchDecoderCTC.load_from_hf_hub(model_name, allow_regex=allow_regex)
  File "/home/user/audio/lib/python3.9/site-packages/pyctcdecode/decoder.py", line 771, in load_from_hf_hub
    cached_directory = snapshot_download(model_id, cache_dir=cache_dir, **kwargs)
  File "/home/user/audio/lib/python3.9/site-packages/huggingface_hub/snapshot_download.py", line 144, in snapshot_download
    model_info = _api.model_info(repo_id=repo_id, revision=revision, token=token)
  File "/home/user/audio/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 912, in model_info
    r.raise_for_status()
  File "/home/user/audio/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models//revision/main
Expected behavior
It should not try to download anything when the model is a path to a local directory.
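To make the expected behavior concrete, here is a minimal sketch of the kind of check I have in mind. It is not the actual transformers code: `is_local_model` and `load_decoder` are hypothetical helper names, and I am assuming the local directory has the layout that pyctcdecode's `load_from_dir` expects (an `alphabet.json` plus a `language_model/` folder, as in the cloned repo).

```python
import os


def is_local_model(model_id: str) -> bool:
    """Return True when model_id points at an existing local directory."""
    return os.path.isdir(model_id)


def load_decoder(model_id: str):
    """Load the beam-search decoder from disk if possible, else from the Hub.

    Hypothetical helper for illustration only; the real pipeline code differs.
    """
    # Imported lazily so the path check can be used without pyctcdecode installed.
    from pyctcdecode import BeamSearchDecoderCTC

    if is_local_model(model_id):
        # Local checkout: read the decoder files from disk, no network request.
        # Assumes alphabet.json and language_model/ live inside model_id.
        return BeamSearchDecoderCTC.load_from_dir(model_id)

    # Otherwise treat model_id as a Hub repo id, which is what happens today
    # for every value, including "./".
    return BeamSearchDecoderCTC.load_from_hf_hub(model_id)
```

With a check like this, `--model_id ./` would take the `load_from_dir` branch instead of calling `snapshot_download` with a path that is not a valid repo id.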