Moritz Thüning
You allocate dim elements for the quants and also dim elements for the scaling factors, but there are only dim / gs scaling factors :)
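To make the sizes concrete, here is a minimal sketch, assuming symmetric int8 group-wise quantization with group size gs, where each group of gs values shares one scale. The function names `quantized_buffer_sizes` and `quantize` are hypothetical and only for illustration; they are not taken from the code under discussion.

```python
def quantized_buffer_sizes(dim: int, gs: int) -> tuple[int, int]:
    """Return (number of quants, number of scaling factors) for `dim` values.

    Assumption: one scaling factor per group of `gs` values.
    """
    if gs <= 0 or dim % gs != 0:
        raise ValueError("dim must be a positive multiple of the group size gs")
    return dim, dim // gs


def quantize(x: list[float], gs: int) -> tuple[list[int], list[float]]:
    """Group-wise symmetric int8 quantization (illustrative sketch)."""
    n_quants, n_scales = quantized_buffer_sizes(len(x), gs)
    q = [0] * n_quants        # dim entries
    s = [0.0] * n_scales      # only dim / gs entries
    for g in range(n_scales):
        group = x[g * gs:(g + 1) * gs]
        wmax = max(abs(v) for v in group)
        scale = wmax / 127.0
        s[g] = scale
        for i, v in enumerate(group):
            # An all-zero group has scale 0; its quants stay 0.
            q[g * gs + i] = round(v / scale) if scale > 0 else 0
    return q, s


q, s = quantize([1.0, -2.0, 0.5, 4.0], gs=2)
print(len(q), len(s))
```

With dim = 4 and gs = 2, the quant buffer holds 4 entries while the scale buffer only needs 2, so allocating dim scaling factors over-allocates by a factor of gs.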
Any progress on this? I have a related issue in TT-Boltz: https://github.com/moritztng/tt-boltz/issues/2
What are those workarounds? In my opinion, matmul should support broadcasting the batch dim.
The third workaround worked for me. Thanks a lot!