
[Feature] Support deterministic inference with Batch Invariant Ops #10278

@Fridge003

Description


- Backbone: attention backend, batch_invariant library integration
- Communication (NCCL)
- Radix cache support
- Model support
- Quantization
- Parallelism
- Speculative decoding
- Performance
- Issues
- Usability & documentation
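As background for why batch invariance matters (an illustration, not taken from the issue): floating-point addition is not associative, so any kernel whose internal reduction order changes with batch size can return different results for the same logical input.

```python
# Not from the issue: a minimal demonstration that float addition is
# non-associative, which is why deterministic inference requires the
# reduction order inside each kernel to be fixed.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # -> 0.6000000000000001
right = a + (b + c)   # -> 0.6
assert left != right
```

The same values summed in a different order give bitwise-different results; a batched kernel that regroups its additions per batch size hits exactly this effect.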

Related resources

https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
https://github.com/thinking-machines-lab/batch_invariant_ops
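The linked repo swaps PyTorch ops for batch-invariant versions; the core idea can be sketched in plain Python (all names below are hypothetical for illustration, not the library's API): pin the reduction split to a fixed chunk size so the floating-point addition order is identical no matter how many rows are in the batch.

```python
# Hypothetical sketch (not the batch_invariant_ops API): a reduction
# whose split is a compile-time constant rather than a function of the
# workload, so the addition order -- and therefore the bit pattern of
# the result -- never depends on batch size.
FIXED_CHUNK = 4

def invariant_row_sum(xs):
    # Sum fixed-size chunks first, then combine the partials in order.
    partials = [sum(xs[i:i + FIXED_CHUNK]) for i in range(0, len(xs), FIXED_CHUNK)]
    total = 0.0
    for p in partials:
        total += p
    return total

def invariant_batch_sum(rows):
    # Every row is reduced with the same fixed split, regardless of
    # how many rows arrive together.
    return [invariant_row_sum(r) for r in rows]

row = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
alone = invariant_batch_sum([row])[0]
batched = invariant_batch_sum([row] * 16)[0]
assert alone == batched  # bit-identical across batch sizes
```

Real GPU kernels instead tune their split/tile sizes to the workload for speed, which is the batch-size-dependent behavior the batch-invariant ops trade away.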
