Conversation
Hi @jiqing-feng, just to be sure: does this concern the ONNX export, the ONNX Runtime integration, or both? For the ONNX export, yes, we should provide the option.
Currently …
Hi @echarlaix @fxmarty, thanks for your comments. I have added …
Hi @jiqing-feng, could you comment a bit on your use case? I'm not sure what the benefit of this is in the ORT integration. Am I misunderstanding something, @gante?
@fxmarty What I write next is from a … At the moment, … This means that while …
So this means that executing this control flow is absolutely necessary, right?
@fxmarty Yeah, I was about to request exactly the same feature. As @gante mentioned above, the …
I don't think the control flow of computing the `position_ids` can be avoided.
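For context, the way `position_ids` are typically derived from the attention mask is a cumulative sum over attended tokens, with padded slots clamped to a dummy value. A plain-Python sketch of that scheme (an illustrative re-implementation, not the library's actual code):

```python
def compute_position_ids(attention_mask):
    """Derive position_ids from a 2D attention mask (lists of 0/1).

    Mirrors the cumsum-based scheme: position = (attended tokens so far) - 1,
    with padded slots filled with a dummy value of 1.
    """
    position_ids = []
    for row in attention_mask:
        seen, positions = 0, []
        for m in row:
            seen += m
            positions.append(seen - 1 if m == 1 else 1)
        position_ids.append(positions)
    return position_ids

# Right padding gives the usual 0..n-1; left padding shifts real positions
# to start after the pad tokens.
print(compute_position_ids([[1, 1, 1, 1], [0, 0, 1, 1]]))
# -> [[0, 1, 2, 3], [1, 1, 0, 1]]
```

This is why left padding matters: without explicit `position_ids`, a model that defaults to `0..seq_len-1` positions silently assigns wrong positions to left-padded batches.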
michaelbenayoun left a comment
On my side, I am OK with adding this as long as we make sure we do not:
- Add any breaking changes
- Break `.generate`; it should work with and without `position_ids` (except for the left-padding case, of course)
```python
@property
def inputs(self) -> Dict[str, Dict[int, str]]:
    if self.use_past_in_inputs:
        common_inputs = {"input_ids": {0: "batch_size"}}
        self.add_past_key_values(common_inputs, direction="inputs")
        common_inputs["attention_mask"] = {0: "batch_size", 1: "past_sequence_length + 1"}
        common_inputs["position_ids"] = {0: "batch_size"}
    else:
        common_inputs = {
            "input_ids": {0: "batch_size", 1: "sequence_length"},
            "attention_mask": {0: "batch_size", 1: "sequence_length"},
            "position_ids": {0: "batch_size", 1: "sequence_length"},
        }
    return common_inputs
```
I guess we can just make `LlamaOnnxConfig` inherit from `GPT2OnnxConfig` and override the `NORMALIZED_CONFIG_CLASS` class attribute?
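A minimal sketch of that inheritance pattern (the class bodies here are placeholders, not Optimum's actual definitions):

```python
# Placeholder stand-ins for the normalized config classes in Optimum.
class NormalizedGPT2Config:
    pass

class NormalizedLlamaConfig:
    pass

class GPT2OnnxConfig:
    NORMALIZED_CONFIG_CLASS = NormalizedGPT2Config
    # the position_ids-aware `inputs` property would live here

class LlamaOnnxConfig(GPT2OnnxConfig):
    # Reuse everything from GPT2OnnxConfig; swap only the normalized config.
    NORMALIZED_CONFIG_CLASS = NormalizedLlamaConfig
```

The appeal of this design is that the `position_ids` handling is written once and every architecture that needs it only overrides the class attribute.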
```python
input_ids=input_ids,
attention_mask=attention_mask,
past_key_values=past_key_values,
position_ids=position_ids,
```
`self.decoder` is an `ORTDecoder`, right?
If so, we also need to update its `forward` method to handle `position_ids`.
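One way to handle this (a hedged sketch; the helper name and signature here are illustrative, not the real `ORTDecoder.forward`) is to add `position_ids` to the ONNX Runtime input feed only when the exported graph actually declares it:

```python
def build_onnx_feed(input_names, input_ids, attention_mask, position_ids=None):
    """Assemble an input feed for an ONNX Runtime session.

    `input_names` is the set of graph input names; position_ids is only
    passed through when the exported graph expects it, so models exported
    without position_ids keep working unchanged.
    """
    feed = {"input_ids": input_ids, "attention_mask": attention_mask}
    if "position_ids" in input_names and position_ids is not None:
        feed["position_ids"] = position_ids
    return feed
```

Gating on the graph's declared inputs is what keeps this change backward compatible with previously exported models.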
```python
        position_ids=position_ids,
    )
else:
    outputs = self.decoder_with_past(
        input_ids=input_ids[:, -1:],
        past_key_values=past_key_values,
        attention_mask=attention_mask,
        position_ids=position_ids,
```
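A side note on the with-past branch: since only the last token is fed (`input_ids[:, -1:]`), its position id must equal the count of attended tokens minus one. An illustrative helper (not Optimum code) for a single mask row:

```python
def last_position_id(attention_mask_row):
    # The newly decoded token's position = number of attended tokens - 1,
    # which is robust to left padding (pad slots contribute 0 to the sum).
    return sum(attention_mask_row) - 1

# No padding [1, 1, 1, 1] -> position 3; left padding [0, 1, 1, 1] -> position 2.
```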
Hi @michaelbenayoun, thanks for your comment. I have updated `position_ids` in the `ORTDecoder` forward. Could you please review it? Thanks! cc @fxmarty @gante
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi everyone, @jiqing-feng @fxmarty @gante …
@jiqing-feng @KexinFeng I now realize I misunderstood your argument and the motivation for the PR. This appears to me to be a major bug in Optimum that should be fixed ASAP.
Closing in favor of #1381 |
Some models, like gpt2 and llama, take `position_ids` among their generation inputs. We can add `position_ids` to the model's ONNX config to support them.
@echarlaix @fxmarty Would you please help review it? Thanks!