The Tinker API recently released by Thinking Machines will have a big impact on how people think about
post-training and inference systems. To allow more people to experiment with
Tinker-like systems and run them on their own hardware, we started SkyRL tx 🧸, an open source project
Philipp Moritz
Co-founder and CTO at @anyscalecompute. Co-creator of @raydistributed. Interested in ML, AI, computing.
San Francisco
Joined April 2017
- Very excited to see the Tinker release by @thinkymachines! @robertnishihara and I had a chance to experiment with the API, see anyscale.com/blog/fine-tuni…. It does a nice job of providing flexibility while abstracting away GPU handling. This will be 🔥 when combined with
- If you are doing LLM inference, FP8 is almost a no-brainer (almost no accuracy loss, support 2x larger models with the same memory, up to 2x faster). We recently contributed FP8 support to vLLM -- check it out! We’ve recently contributed FP8 support to the @vllm_project in collaboration with @neuralmagic. With this feature, you can see up to a 1.8x reduction in inter-token latency, with >99% accuracy preservation! 1/n
- We are happy to release SkyRL tx 0.0.2, an open source library that implements a backend for the Thinking Machines Tinker API and allows people to set up their own Tinker-like service running on their own hardware. There are lots of new features and it is exciting to see the first
- We are happy to release SkyRL tx 0.1 novasky-ai.notion.site/skyrl-tx-v010, an open source unified training and inference engine that supports the Tinker API. This release has many performance enhancements and new features, but most importantly, RL training is now working end-to-end. If you
- We are happy to announce SkyRL tx 0.0.3! SkyRL tx is an open source library that implements a backend for the Tinker API and allows people to set up their own Tinker-like service running on their own hardware. This release has full MoE support, better checkpointing and the first
- Thanks to vLLM, Anyscale Endpoints is at the top of the LLM performance leaderboard 🚀. We are excited to merge more advanced performance optimizations & features like speculative decoding and per-request LoRA adapters upstream soon, stay tuned! 📈 We’re excited to introduce the LLMPerf leaderboard: the first public and open source leaderboard for benchmarking performance of various LLM inference providers in the market. Our goal with this leaderboard is to equip users and developers with a clear understanding of the
- There have been a lot of open source RL libraries for training LLMs popping up recently. We took a stab at describing some of the use cases and design decisions they are optimized for:
- Check out this recent blog post blog.vllm.ai/2025/04/23/ope… which describes how OpenRLHF runs on top of @raydistributed and @vllm_project
- After using uv for a while, I think it finally solves most Python dependency problems. Ray and uv fit together perfectly to make package management on a cluster seamless. Check our blog post
- Do you find it challenging to run RL / agent simulations at a large scale (e.g. dealing with docker and remote execution)? Check out our blog post anyscale.com/blog/massively… where we show how to do it with Ray and mini-swe-agent (kudos to @KLieret)
- 🚀 We are introducing SkyRL-v0.1: A highly-modular RL library for training LLMs! ✨ Key features: 1) Simple modular design – adapt to your needs by implementing core interfaces 2) 1.8x faster training with async rollouts 3) Optional built-in gymnasium of tool-use tasks (math,
- If you have been working on vLLM-related projects (e.g. contributions to vLLM like optimizations or new features, vLLM deployment strategies, or interesting use cases and applications), consider submitting a talk proposal! The vLLM and Ray community would love to hear about it :) There has been so much excitement and activity around this topic that we are adding a vLLM track to the Ray Summit! If you contribute to or use @vllm_project, we want to hear from you. raysummit.anyscale.com










