Moritz Thüning

4 comments by Moritz Thüning

You allocate dim elements for the quants and also dim elements for the scaling factors, but there are only dim / gs scaling factors (one per group of gs values) :)
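To make the sizes concrete, here is a minimal NumPy sketch of group-wise int8 quantization. It is not the code from the thread: the function name `quantize_groupwise` and the symmetric max-abs scaling scheme are assumptions for illustration. Only `dim` and `gs` (group size) come from the comment.

```python
import numpy as np

def quantize_groupwise(x: np.ndarray, gs: int):
    """Symmetric int8 quantization with one scale per group of gs values.

    Hypothetical helper for illustration only.
    """
    dim = x.size
    assert dim % gs == 0, "dim must be divisible by the group size"
    groups = x.reshape(dim // gs, gs)
    # One scale per group: map the largest magnitude in the group to 127.
    max_abs = np.abs(groups).max(axis=1)
    scales = (max_abs / 127.0).astype(np.float32)
    # Avoid dividing by zero when a group is all zeros.
    safe = np.where(scales == 0, 1.0, scales)
    q = np.round(groups / safe[:, None]).astype(np.int8)
    return q.reshape(dim), scales

dim, gs = 64, 16
x = np.random.default_rng(0).standard_normal(dim).astype(np.float32)
q, s = quantize_groupwise(x, gs)
print(q.shape, s.shape)  # (64,) (4,)
```

The quantized buffer needs `dim` entries, while the scale buffer needs only `dim // gs`, so allocating `dim` floats for the scales over-allocates by a factor of `gs`.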

Any progress on this? I have a related issue in TT-Boltz: https://github.com/moritztng/tt-boltz/issues/2

What are those workarounds? In my opinion, matmul should support broadcasting the batch dim.
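For comparison, this is what batch-dim broadcasting looks like in NumPy's `matmul`. It only illustrates the behavior being asked for and says nothing about the API of the library discussed in the thread. The shapes are made up for the example.

```python
import numpy as np

a = np.ones((8, 4, 5), dtype=np.float32)  # batch of 8 matrices, each 4x5
b = np.ones((5, 3), dtype=np.float32)     # a single 5x3 matrix, no batch dim

# b is broadcast across the batch dimension of a.
c = np.matmul(a, b)
print(c.shape)  # (8, 4, 3)

# Same result with an explicit batch dim of size 1 on b.
b1 = b[None, :, :]
c1 = np.matmul(a, b1)
assert np.array_equal(c, c1)
```

Without broadcasting, the caller has to repeat `b` along the batch dimension first, which costs extra memory for no extra information.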

The third workaround worked for me. Thanks a lot!