📢 Introducing MPT: a new family of open-source commercially usable LLMs from @MosaicML. Trained on 1T tokens of text+code, MPT models match and - in many ways - surpass LLaMa-7B. This release includes 4 models: MPT-Base, Instruct, Chat, & StoryWriter (🧵)
mosaicml.com/blog/mpt-7b
Databricks AI Research
We remove the barriers to state-of-the-art generative AI model development and make data + AI available to all.
- 📣Announcing MosaicML Inference 📣 Ever wanted a text or image generation API that doesn’t make you send data to a third party? Or a cheaper solution than paying by the token? Or an easy way to get a trained model into production? We can help with that. 🧵
- Introducing training LLMs with AMD hardware! MosaicML + PyTorch 2.0 + ROCm 5.4+ = LLM training out of the box with zero code changes. With MosaicML, the ML community has additional hardware + software options to choose from. Read more: mosaicml.com/blog/amd-mi250
- Meet MPT-30B, the latest member of @MosaicML's family of open-source, commercially usable models. It's trained on 1T tokens with up to 8k context (even more w/ALiBi) on A100s and *H100s* with big improvements to Instruct and Chat. Take it for a spin on HF! huggingface.co/spaces/mosaicm…
- Meet PubMed GPT 🩺 a new SOTA on the US Medical Licensing Exam developed by MosaicML and @StanfordHAI. It's a normal GPT-3B model trained on medical data that bests hand-designed med models and generic models 40x bigger, a sweet spot for foundation models 🧵 mosaicml.com/blog/introduci…
- [1/8] Full technical details on our Stable Diffusion 2.0 speedrun are here! On Wednesday, we announced that we had replicated SD2 for < $50k, 2.7x over our baseline and 6x over Stability's number. Today, we share the technical nitty-gritty on how we did it:
- Woo hoo! 🙌 What an honor to make the @Forbes AI 50 List. MosaicML empowers you to build your own #GenerativeAI. Train, finetune, and deploy your custom #LLM today: mosaicml.com
- Got an extra $20 burning a hole in your wallet? With the MosaicBERT architecture + training recipe, you can now pretrain a competitive #BERT-Base model from scratch on the MosaicML platform for the cost of a large pizza! 🍕⚡️👏 Learn more: mosaicml.com/blog/mosaicbert
- Announcing MPT-7B-8K: a 7B parameter open-source LLM with 8k context length trained with the MosaicML platform. With its 8k context length, MPT-7B-8K specializes in document summarization and question-answering, and may be used commercially. Read more: mosaicml.com/blog/long-cont…
- We have exciting news! In our latest and greatest LLM blog, we show how MosaicML Cloud can help you train LLMs from 1B - 70B parameters, and for the first time, publish transparent times + costs for doing so. It's a lot cheaper than you think! (1/9)
- 📢 MosaicML Cloud is now available for early access! Create advanced AI models faster and cheaper than you thought possible. mosaicml.com/blog/introduci…
- The MosaicML team is excited to present at the @wandb webinar this Thursday, 23-Feb-2023, 7PM CET/10AM PST! Our very own @leavittron will be joining W&B's @carey_phelps to showcase MosaicML #LLM training and W&B's Model Registry. Register at webinar.mosaicml.wandb.events/?utm_source=so…