
ONNX export constant folding breaks shared weight deduplication #108342

@fxmarty

Description


🐛 Describe the bug

Hi, it appears that using do_constant_folding=True in the ONNX export undoes some weight deduplication. For example, an nn.Linear weight will go from

[image]

&

[image]

to

[image]

effectively transposing the weight. Given that DeduplicateInitializersByDataPtr relies on the tensor size, the deduplication pass will fail when the shared weight has a different size (e.g. an embedding weight).

It seems to me that the initializer deduplication should happen before constant folding, and that constant folding should be applied only to non-shared weights.
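To illustrate the failure mode, here is a minimal sketch (not from the issue; names and the helper are hypothetical) of a dedup pass keyed on storage pointer and byte size, in the spirit of what DeduplicateInitializersByDataPtr checks. A tied weight shares storage and is deduplicated; a constant-folded, materialized transpose lives in new storage and is missed:

```python
import torch

def dedup_by_data_ptr(named_tensors):
    """Keep only the first initializer seen for each (data_ptr, nbytes) pair."""
    seen = set()
    kept = []
    for name, t in named_tensors:
        key = (t.data_ptr(), t.untyped_storage().nbytes())
        if key not in seen:
            seen.add(key)
            kept.append(name)
    return kept

w = torch.randn(4, 8)
tied = w                      # shared storage: same data_ptr as w
folded = w.t().contiguous()   # what constant folding materializes: new storage

# Tied weight is deduplicated; the folded copy is kept as a duplicate.
print(dedup_by_data_ptr([("emb.weight", w), ("lm_head.weight", tied)]))
print(dedup_by_data_ptr([("emb.weight", w), ("lm_head.weight", folded)]))
```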

WDYT @justinchuby @BowenBao?

Thank you!

Repro:

```
pip install optimum
optimum-cli export onnx -m bigscience/bloom-560m bloom_onnx --no-post-process
```

and inspect the output with Netron.

Versions

Reproduced on both PyTorch 2.0.1 and nightly.

Metadata

Assignees: no one assigned

Labels: module: onnx (Related to torch.onnx), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Status: Reopened
