Efficient, Flexible and Portable Structured Generation
- [2026/5] XGrammar-2 has been released! Check out our blog for more information.
- [2025/12] XGrammar has been officially integrated into Mirai.
- [2025/09] XGrammar has been officially integrated into OpenVINO GenAI.
- [2025/02] XGrammar has been officially integrated into Modular's MAX.
- [2025/01] XGrammar has been officially integrated into TensorRT-LLM.
- [2024/12] XGrammar has been officially integrated into vLLM.
- [2024/12] We presented research talks on XGrammar at CMU, UC Berkeley, MIT, THU, SJTU, Ant Group, LMSys, Qingke AI, Camel AI. The slides can be found here.
- [2024/11] XGrammar has been officially integrated into SGLang.
- [2024/11] XGrammar has been officially integrated into MLC-LLM.
- [2024/11] We officially released XGrammar v0.1.0!
XGrammar is an open-source library for efficient, flexible, and portable structured generation.
It uses constrained decoding to ensure 100% structural correctness of the output. It supports general context-free grammars, which cover a broad range of structures, including JSON, regex, and custom context-free grammars.
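To make the idea of constrained decoding concrete, here is a minimal, self-contained sketch in plain Python. It does not use the XGrammar API. The toy vocabulary, the scripted `toy_logits` function standing in for a model, and the helper names (`allowed_token_ids`, `generate`) are all assumptions made for illustration. The grammar is a small context-free language (non-empty balanced brackets), tracked with a depth counter instead of a general pushdown automaton.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of tokens.
VOCAB = ["[", "]", "<eos>"]
OPEN, CLOSE, EOS = 0, 1, 2

# Scripted stand-in for model logits (assumption: no real model is involved).
# The scores deliberately favor tokens that would break the structure.
SCRIPTED_LOGITS = [
    [1.0, 3.0, 2.0],  # step 0: prefers "]" before anything was opened
    [2.0, 1.0, 3.0],  # step 1: prefers "<eos>" while a bracket is still open
    [0.0, 1.0, 3.0],  # step 2 and later: prefers "<eos>", then "]"
]


def toy_logits(step):
    """Return the scripted scores for this step (the last row repeats)."""
    return list(SCRIPTED_LOGITS[min(step, len(SCRIPTED_LOGITS) - 1)])


def allowed_token_ids(depth, emitted):
    """Token ids the balanced-bracket grammar permits at this point."""
    allowed = {OPEN}          # opening a bracket is always valid
    if depth > 0:
        allowed.add(CLOSE)    # closing needs an unmatched "["
    if depth == 0 and emitted > 0:
        allowed.add(EOS)      # stopping needs a non-empty, balanced string
    return allowed


def generate(constrained, max_steps=16):
    """Greedy decoding; optionally mask tokens the grammar forbids."""
    depth, out = 0, []
    for step in range(max_steps):
        logits = toy_logits(step)
        if constrained:
            allowed = allowed_token_ids(depth, len(out))
            # Forbidden tokens get -inf so they can never be selected.
            logits = [s if i in allowed else -math.inf
                      for i, s in enumerate(logits)]
        token = max(range(len(VOCAB)), key=lambda i: logits[i])
        if token == EOS:
            break
        out.append(VOCAB[token])
        depth += 1 if token == OPEN else -1
    return "".join(out)


if __name__ == "__main__":
    print("unconstrained:", generate(constrained=False))
    print("constrained:  ", generate(constrained=True))
```

With these scripted scores, the unconstrained run emits an unmatched closing bracket, while the constrained run is forced onto a balanced string. A production engine follows the same mask-then-sample pattern, but has to compute the mask over the full tokenizer vocabulary for an arbitrary grammar at every step, which is where the overhead discussed in this README comes from.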
XGrammar uses careful optimizations to achieve extremely low overhead in structured generation. It has achieved near-zero overhead in JSON generation, making it one of the fastest structured generation engines available.
XGrammar features universal deployment. It supports:
- Platforms: Linux, macOS, Windows
- Hardware: CPU, NVIDIA GPU, AMD GPU, Apple Silicon, TPU, etc.
- Languages: Python, C++, JavaScript, and Swift APIs
- Models: Qwen, Llama, DeepSeek, Phi, Gemma, etc.
XGrammar is easy to integrate with LLM inference engines. It is the default structured generation backend for most LLM inference engines, including vLLM, SGLang, TensorRT-LLM, and MLC-LLM, and it is also used by many companies. You can try out the structured generation modes of these engines as well!
Install XGrammar:
    pip install xgrammar
For use with MPS on Apple Silicon, install with:
    pip install "xgrammar[metal]"
Import XGrammar:
    import xgrammar as xgr
Please visit our documentation to get started with XGrammar.
- Rust: xgrammar-rs — Community Rust bindings for XGrammar.
XGrammar has been widely adopted in industry, open-source projects, and academia.
If you find XGrammar useful in your research, please consider citing our papers:
@article{dong2024xgrammar,
title={XGrammar: Flexible and efficient structured generation engine for large language models},
author={Dong, Yixin and Ruan, Charlie F and Cai, Yaxing and Lai, Ruihang and Xu, Ziyi and Zhao, Yilong and Chen, Tianqi},
journal={Proceedings of Machine Learning and Systems 7},
year={2024}
}
@inproceedings{10.1145/3786335.3813124,
author = {Li, Linzhang and Dong, Yixin and Wang, Guanjie and Xu, Ziyi and Jiang, Alexander and Chen, Tianqi},
title = {XGrammar-2: Dynamic and Efficient Structured Generation Engine for Agentic LLMs},
year = {2026},
isbn = {9798400724152},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3786335.3813124},
booktitle = {Proceedings of the ACM Conference on AI and Agentic Systems},
pages = {1009--1022},
numpages = {14}
}





