[Feature] QAT scheme: A16W8 Int8 WeightOnly Quantization #3845

@electroglyph

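# torchao's int8 weight-only config (A16W8: int8 weights, 16-bit activations)
from torchao.quantization import quantize_, Int8WeightOnlyConfig
quantize_(model, Int8WeightOnlyConfig())  # modifies `model` in place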

It would be nice to have this scheme available for QAT, for Gemma models (and others too, I assume).

I might get around to implementing this one myself.
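
For anyone picking this up, here is a rough sketch of what the fake-quantized forward pass for an A16W8 scheme could look like during training. It is plain PyTorch, not torchao's QAT API. The function name `fake_quant_int8_per_channel` is hypothetical, and symmetric per-output-channel scales with a straight-through estimator are my assumptions about the scheme, not something confirmed by the library.

```python
import torch


def fake_quant_int8_per_channel(w: torch.Tensor) -> torch.Tensor:
    """Quantize-dequantize a 2D weight to int8 levels, one scale per row.

    Hypothetical helper for illustration. Assumes symmetric quantization
    with the range [-127, 127] and one scale per output channel (row).
    """
    # Scale is computed from detached weights so it is treated as a constant.
    max_abs = w.detach().abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    scale = max_abs / 127.0

    # Round to the int8 grid, then map back to floating point.
    q = torch.round(w / scale).clamp(-127, 127)
    w_dq = q * scale

    # Straight-through estimator: forward uses w_dq, backward sees identity.
    return w + (w_dq - w).detach()


# Usage: activations stay in the model's dtype, only weights are fake-quantized.
linear = torch.nn.Linear(8, 4, bias=False)
x = torch.randn(2, 8)
y = torch.nn.functional.linear(x, fake_quant_int8_per_channel(linear.weight))
y.sum().backward()  # gradients still reach linear.weight through the STE
```

The idea is that the model trains against the rounding error it will see after a real int8 weight-only conversion, while gradients pass through the rounding step unchanged.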
