
Onnx enable tasks for supported models (part 2) #14700

Merged
michaelbenayoun merged 11 commits into huggingface:master from michaelbenayoun:onnx_enable_tasks_for_supported_models_part_2
Dec 22, 2021

Conversation

@michaelbenayoun
Member

@michaelbenayoun michaelbenayoun commented Dec 9, 2021

What does this PR do?

This PR reapplies the reverted PR #14358, and solves the issues that caused the revert.


This PR adds support for almost all the features available for already supported models.

Main contributions:

  • OnnxSeq2SeqConfigWithPast: a new class inheriting from OnnxConfigWithPast, designed specifically for seq2seq models; this should make it easier for the community to contribute support for new models.
  • Tests refactoring and parameterization: every (model, feature) export pair is now tested, each as a standalone test (previously everything ran as one big test).
  • Many new features requested by the community (a feature is a task, plus the choice of whether to use past_key_values); see the list of supported features below.

Features now supported:

  • For BERT like models: default, sequence-classification, token-classification and question-answering (multiple-choice will be added later).
  • For causal language models (GPT-2 and GPT-Neo): default, default-with-past, causal-lm, causal-lm-with-past, sequence-classification and token-classification (GPT-2 only).
  • For Seq2Seq models (T5, BART, mBART):
    • T5, BART, mBART: default, default-with-past, seq2seq-lm, seq2seq-lm-with-past
    • BART, mBART: causal-lm, causal-lm-with-past, sequence-classification, question-answering
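As described above, a feature is just a task name, optionally suffixed with `-with-past` when past_key_values are exported. A minimal sketch of that naming convention (`feature_name` is a hypothetical helper for illustration, not part of the transformers API):

```python
def feature_name(task: str, with_past: bool = False) -> str:
    """Compose a feature string: the task name, plus an optional
    "-with-past" suffix when past_key_values should be exported.
    Hypothetical helper for illustration only."""
    return f"{task}-with-past" if with_past else task

# The causal-lm task exported with key/value caching:
print(feature_name("causal-lm", with_past=True))  # causal-lm-with-past
```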

Member

@LysandreJik LysandreJik left a comment

Hey! Thanks for this, this is impressive work.

I wonder if it would be possible to upstream some of the content written in generate_dummy_inputs in the raw OnnxConfig object? It seems like a lot of the code can be reused among other models.

If it cannot be done, could you mention the current blockers so that we can study what needs to be done? For example, a clear separation between which models are encoder-decoders, which need past key value handling, etc.

Overall I'd argue that having very self-contained methods, that don't hop between different files, is a big plus in terms of readability. Having those be very explicit in the parent ONNX configuration and called with explicit method names in the downstream model-specific ONNX configuration would be a huge plus in terms of readability, in my opinion.
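The separation the review asks for could look something like the following template-method pattern; class and method names here are illustrative sketches, not the actual OnnxConfig API:

```python
class BaseOnnxConfigSketch:
    """Shared dummy-input logic lives in the parent class (sketch only)."""

    def generate_dummy_inputs(self, batch_size: int = 2, seq_length: int = 8) -> dict:
        # Inputs common to every model family.
        inputs = {
            "input_ids": [[0] * seq_length] * batch_size,
            "attention_mask": [[1] * seq_length] * batch_size,
        }
        # Explicitly named hook: model-family specifics are added here,
        # so the control flow stays visible in the parent class.
        inputs.update(self.extra_dummy_inputs(batch_size, seq_length))
        return inputs

    def extra_dummy_inputs(self, batch_size: int, seq_length: int) -> dict:
        return {}


class Seq2SeqOnnxConfigSketch(BaseOnnxConfigSketch):
    """Encoder-decoder configs override only the explicit hook."""

    def extra_dummy_inputs(self, batch_size: int, seq_length: int) -> dict:
        return {"decoder_input_ids": [[0] * seq_length] * batch_size}
```

The point of the pattern is that a reader of the parent class sees exactly where subclass behavior plugs in, without hopping between files.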

@lewtun
Member

lewtun commented Dec 17, 2021

Gently pinging @LysandreJik for his blessing on the latest round of changes :)

@sorenmc

sorenmc commented Dec 21, 2021

Does this ONNX conversion support beam search automatically for BART based summarizers?

@lewtun
Member

lewtun commented Dec 21, 2021

> Does this ONNX conversion support beam search automatically for BART based summarizers?

Hi @sorenmc, no, you'll have to implement your own .generate() method for the ONNX models. There is a related feature request in the optimum library here. In the meantime, you might be interested in the BART summarization example here.
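For readers wondering what "implement your own .generate()" could look like, here is a minimal greedy-decoding sketch over an exported seq2seq-lm model. It is an assumption-laden sketch, not transformers code: it assumes an onnxruntime InferenceSession whose graph takes input_ids/attention_mask/decoder_input_ids/decoder_attention_mask and returns logits, no past-key-value caching, and decoding that starts from the eos token as BART does:

```python
import numpy as np

def greedy_decode(session, input_ids, attention_mask, eos_token_id, max_length=64):
    """Greedy loop over an exported seq2seq-lm ONNX model (sketch).
    session: an onnxruntime.InferenceSession-like object (only .run is used)."""
    # BART conventionally starts decoding from the eos token.
    decoder_input_ids = np.array([[eos_token_id]], dtype=np.int64)
    for _ in range(max_length):
        logits = session.run(
            ["logits"],
            {
                "input_ids": input_ids,
                "attention_mask": attention_mask,
                "decoder_input_ids": decoder_input_ids,
                "decoder_attention_mask": np.ones_like(decoder_input_ids),
            },
        )[0]
        # Pick the most likely next token from the last decoder position.
        next_token = logits[:, -1, :].argmax(axis=-1).reshape(1, 1)
        decoder_input_ids = np.concatenate([decoder_input_ids, next_token], axis=1)
        if next_token.item() == eos_token_id:
            break
    return decoder_input_ids
```

Beam search replaces the argmax with per-beam score bookkeeping; the optimum feature request linked above tracks a proper implementation.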

@michaelbenayoun michaelbenayoun force-pushed the onnx_enable_tasks_for_supported_models_part_2 branch from c8cc572 to 600e5f2 Compare December 21, 2021 17:52
Member

@LysandreJik LysandreJik left a comment


Thanks for the update! I'm starting to think that even these ONNX changes could potentially live in optimum, and that we could have a requirement on optimum if we wanted to use ONNX anywhere in the library, but I understand this might be a bit complex to maintain in the long run.

I think this adds a lot of complexity, but I understand why it's needed for good ONNX support. OK to merge it like this, but I would like to revisit this at some point in the near future to discuss the separation of ONNX features between optimum and transformers.

@Avi-avidan

Avi-avidan commented May 15, 2022

Hi,
Thanks, HF team, for your great support on this.
I am trying to export a summarization BART model.
transformers.__version__ == 4.19.0.dev0
onnxruntime.__version__ == 1.11.1

```python
from transformers import pipeline

model_name = "lidiya/bart-base-samsum"
summarizer = pipeline("summarization", model=model_name, tokenizer=model_name)
```

```python
from transformers import AutoConfig, AutoModelForSeq2SeqLM
from transformers.models.bart import BartOnnxConfig

config = AutoConfig.from_pretrained(model_name)
onnx_config = BartOnnxConfig(config, task="default")
print(onnx_config.outputs)
```

OrderedDict([('last_hidden_state', {0: 'batch', 1: 'decoder_sequence'})])

I have tried a few export options, and none of them gives me the output from the decoder.

option 1:

```shell
/Users/aavidan/envs/py39/bin/python3.9 -m transformers.onnx --model=lidiya/bart-base-samsum --feature=seq2seq-lm --atol=5e-5 onnx
```

output 1:

```
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Using framework PyTorch: 1.10.2
Overriding 1 configuration item(s)
- use_cache -> False
/Users/aavidan/envs/py39/lib/python3.9/site-packages/transformers/models/bart/modeling_bart.py:230: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/Users/aavidan/envs/py39/lib/python3.9/site-packages/transformers/models/bart/modeling_bart.py:236: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/Users/aavidan/envs/py39/lib/python3.9/site-packages/transformers/models/bart/modeling_bart.py:267: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/Users/aavidan/envs/py39/lib/python3.9/site-packages/transformers/models/bart/modeling_bart.py:907: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1:
Validating ONNX model...
  -[✓] ONNX model output names match reference model ({'logits'})
  - Validating ONNX Model output "logits":
    -[✓] (2, 8, 50265) matches (2, 8, 50265)
    -[✓] all values close (atol: 5e-05)
All good, model saved at: onnx/model.onnx
```

```python
import numpy as np
from onnxruntime import InferenceSession, SessionOptions, GraphOptimizationLevel

options = SessionOptions()
options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

session = InferenceSession(
    "onnx/model.onnx",
    sess_options=options, providers=["CPUExecutionProvider"]
)

session.disable_fallback()

outputs = [i.name for i in session.get_outputs()]

feed_dict = summarizer.tokenizer(text)  # text: the article to summarize
feed_dict["decoder_input_ids"] = feed_dict["input_ids"]
feed_dict["decoder_attention_mask"] = feed_dict["attention_mask"]
feed_dict = {k: np.array([v]) for k, v in feed_dict.items()}
pred = session.run(None, feed_dict)

for i, p in enumerate(pred):
    print(i, outputs[i], p.shape)
```

printout -

```
0 logits (1, 228, 50265)
1 1209 (1, 228, 768)
```

```python
summarizer.tokenizer.decode(pred[0][0].argmax(axis=-1), skip_special_tokens=True)
```

what I get -

This gives me back the input text, which basically means the logits simply mirror the input_ids, and I am guessing from its shape that output 1209 holds the encoder's hidden states for every input token. If that is the case, how do I export the base_model.decoder?

option 2:

```python
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

convert(framework="pt", model=summarizer.model, output=Path("onnx/lidiya_bart1.onnx"),
        opset=11, tokenizer=summarizer.tokenizer, pipeline_name="summarization")
```

this results in the following error -

```
using framework PyTorch: 1.10.2
found input input_ids with shape: {0: 'batch', 1: 'sequence'}
found input attention_mask with shape: {0: 'batch', 1: 'sequence'}
found output output_0 with shape: {0: 'batch', 1: 'sequence'}
found output output_1 with shape: {0: 'batch', 2: 'sequence'}
found output output_1 with shape: {0: 'batch', 2: 'sequence'}
found output output_1 with shape: {0: 'batch', 2: 'sequence'}
found output output_1 with shape: {0: 'batch', 2: 'sequence'}
found output output_2 with shape: {0: 'batch', 2: 'sequence'}
found output output_2 with shape: {0: 'batch', 2: 'sequence'}
found output output_2 with shape: {0: 'batch', 2: 'sequence'}
found output output_2 with shape: {0: 'batch', 2: 'sequence'}
found output output_3 with shape: {0: 'batch', 2: 'sequence'}
found output output_3 with shape: {0: 'batch', 2: 'sequence'}
found output output_3 with shape: {0: 'batch', 2: 'sequence'}
found output output_3 with shape: {0: 'batch', 2: 'sequence'}
found output output_4 with shape: {0: 'batch', 2: 'sequence'}
found output output_4 with shape: {0: 'batch', 2: 'sequence'}
found output output_4 with shape: {0: 'batch', 2: 'sequence'}
found output output_4 with shape: {0: 'batch', 2: 'sequence'}
found output output_5 with shape: {0: 'batch', 2: 'sequence'}
found output output_5 with shape: {0: 'batch', 2: 'sequence'}
found output output_5 with shape: {0: 'batch', 2: 'sequence'}
found output output_5 with shape: {0: 'batch', 2: 'sequence'}
found output output_6 with shape: {0: 'batch', 2: 'sequence'}
found output output_6 with shape: {0: 'batch', 2: 'sequence'}
found output output_6 with shape: {0: 'batch', 2: 'sequence'}
found output output_6 with shape: {0: 'batch', 2: 'sequence'}
found output output_7 with shape: {0: 'batch', 1: 'sequence'}
ensuring inputs are in correct order
decoder_input_ids is not present in the generated input list.
generated inputs order: ['input_ids', 'attention_mask']
```

```
ValueError                                Traceback (most recent call last)
Input In [10], in <cell line: 6>()
      3 from transformers.convert_graph_to_onnx import convert
      5
----> 6 convert(framework="pt", model=summarizer.model, output=Path(f"onnx/lidiya_bart1.onnx"),
      7         opset=11, tokenizer=summarizer.tokenizer, pipeline_name="summarization")

File ~/envs/py39/lib/python3.9/site-packages/transformers/convert_graph_to_onnx.py:395, in convert(framework, model, output, opset, tokenizer, use_external_format, pipeline_name, **model_kwargs)
    393 # Export the graph
    394 if framework == "pt":
--> 395     convert_pytorch(nlp, opset, output, use_external_format)
    396 else:
    397     convert_tensorflow(nlp, opset, output)

File ~/envs/py39/lib/python3.9/site-packages/transformers/convert_graph_to_onnx.py:285, in convert_pytorch(nlp, opset, output, use_external_format)
    282 # PyTorch deprecated the enable_onnx_checker and use_external_data_format arguments in v1.11,
    283 # so we check the torch version for backwards compatibility
    284 if parse(torch.__version__) <= parse("1.10.99"):
--> 285     export(
    286         nlp.model,
    287         model_args,
    288         f=output.as_posix(),
    289         input_names=ordered_input_names,
    290         output_names=output_names,
    291         dynamic_axes=dynamic_axes,
    292         do_constant_folding=True,
    293         use_external_data_format=use_external_format,
    294         enable_onnx_checker=True,
    295         opset_version=opset,
    296     )
    297 else:
    298     export(
    299         nlp.model,
    300         model_args,
    (...)
    306         opset_version=opset,
    307     )

File ~/envs/py39/lib/python3.9/site-packages/torch/onnx/__init__.py:316, in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, _retain_param_name, do_constant_folding, example_outputs, strip_doc_string, dynamic_axes, keep_initializers_as_inputs, custom_opsets, enable_onnx_checker, use_external_data_format)
     38 r"""
     39 Exports a model into ONNX format. If model is not a
     40 :class:`torch.jit.ScriptModule` nor a :class:`torch.jit.ScriptFunction`, this runs
    (...)
    312 model to the file f even if this is raised.
    313 """
    315 from torch.onnx import utils
--> 316 return utils.export(model, args, f, export_params, verbose, training,
    317                     input_names, output_names, operator_export_type, opset_version,
    318                     _retain_param_name, do_constant_folding, example_outputs,
    319                     strip_doc_string, dynamic_axes, keep_initializers_as_inputs,
    320                     custom_opsets, enable_onnx_checker, use_external_data_format)

File ~/envs/py39/lib/python3.9/site-packages/torch/onnx/utils.py:109, in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, _retain_param_name, do_constant_folding, example_outputs, strip_doc_string, dynamic_axes, keep_initializers_as_inputs, custom_opsets, enable_onnx_checker, use_external_data_format)
    104 if use_external_data_format is not None:
    105     warnings.warn("`use_external_data_format' is deprecated and ignored. Will be removed in next "
    106                   "PyTorch release. The code will work as it is False if models are not larger than 2GB, "
    107                   "Otherwise set to False because of size limits imposed by Protocol Buffers.")
--> 109 _export(model, args, f, export_params, verbose, training, input_names, output_names,
    110         operator_export_type=operator_export_type, opset_version=opset_version,
    111         do_constant_folding=do_constant_folding, example_outputs=example_outputs,
    112         dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs,
    113         custom_opsets=custom_opsets, use_external_data_format=use_external_data_format)

File ~/envs/py39/lib/python3.9/site-packages/torch/onnx/utils.py:728, in _export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, export_type, example_outputs, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, fixed_batch_size, custom_opsets, add_node_names, use_external_data_format, onnx_shape_inference)
    726 if dynamic_axes is None:
    727     dynamic_axes = {}
--> 728 _validate_dynamic_axes(dynamic_axes, model, input_names, output_names)
    730 graph, params_dict, torch_out = \
    731     _model_to_graph(model, args, verbose, input_names,
    732                     output_names, operator_export_type,
    (...)
    735                     training=training,
    736                     dynamic_axes=dynamic_axes)
    738 # TODO: Don't allocate a in-memory string for the protobuf

File ~/envs/py39/lib/python3.9/site-packages/torch/onnx/utils.py:1314, in _validate_dynamic_axes(dynamic_axes, model, input_names, output_names)
   1312 for i, x in enumerate(value):
   1313     if not isinstance(x, int):
-> 1314         raise ValueError("The type of axis index is expected to be an integer")
   1315     if x in value_dict:
   1316         warnings.warn("Duplicate dynamic axis index {} was provided for input {}."
   1317                       .format(x, key))

ValueError: The type of axis index is expected to be an integer
```

btw, I get the same error when trying to export only the decoder using:

```python
convert(framework="pt", model=summarizer.model.base_model.decoder, output=Path("onnx/lidiya_dec.onnx"),
        opset=11, tokenizer=summarizer.tokenizer, pipeline_name="summarization")
```

option 3:

```python
torch.onnx.export(
    summarizer.model,
    (inputs["input_ids"], inputs["attention_mask"]),
    "onnx/lidiya_torch_onnx_exp.onnx",
    opset_version=11,
)
```

what I get -

Like option 1, this successfully exports the encoder (judging by the shapes of the exported layers), but I still cannot export the decoder.

btw, I saw a bunch of references on how to implement beam search, but all the links given are broken or unreachable, so could you please re-post a working link as well?

thanks a lot!

