A lot of the insider knowledge on how to build an LLM has gone underground in the last 24 months.
We are going to build #SnowflakeArctic in the open
Model arch ablations, training and inference system performance, dataset and data composition ablations, post-training fun, big
Vivek Raghunathan
* SVP eng at @snowflakedb.
* AI + search at @snowflakedb.
* Co-founder @Neeva . #NeevaAI = AI search.
* Ex-VP of Eng @Google (ads, YT, Google Now)
- Blown away by @OpenAI GTM. Great launch: 🚩 Many months of pre-launch buzz 🚩 Soft launch on Bing 🚩 Tons of embargoed launch partners 🚩 Paywall to drive ChatGPT-Plus signups 🚩 From arxiv to marketing white papers 🚩 s/SOTA on MMLU/GPT-4 can get into Stanford/g Congrats ...
- A lot of the insider knowledge on how to build an LLM has gone underground in the last 24 months. We promised to build #SnowflakeArctic in the open, and here we are, with the third edition of our cookbook series, this time on data ... Data ablations are the lifeblood of any LLM
- I went to Google straight out of school, and spent 12 years there in a variety of engineering roles. Contrary to all the advice you'll read about levels and scope and responsibility and setting yourself up for success, growing your career in any eng org is pretty simple. 👇
- World-class search needs world-class metrics 🚀 And great metrics need to be constantly evolved to avoid overfitting 🤟 As we build out a great search experience at @SnowflakeDB AI research, we are excited to join forces with @lintool and the University of Waterloo 🙌 Our
- OpenAI GTM continues to be on a tear. Without looking at the details, just the sheer momentum of announcements and partnerships has me astounded. openai.com/blog/chatgpt-p…
- Excited to announce #SnowflakeArctic, our new OSS LLM. Play with it at arctic.streamlit.app Read our cookbook at snowflake.com/en/data-cloud/… Read our blog at snowflake.com/blog/arctic-op… We are just getting started ....
  @SnowflakeDB is thrilled to announce #SnowflakeArctic: A state-of-the-art large language model uniquely designed to be the most open, enterprise-grade LLM on the market. This is a big step forward for open source LLMs. And it’s a big moment for Snowflake in our #AI journey as
- Excited to be joining forces w/ Frank, Benoit, Thierry, Christian, Greg, and everyone at @SnowflakeDB !
  We are thrilled to announce @Neeva is joining @SnowflakeDB. We're bringing our expertise in search, AI + LLMs to Snowflake’s customers to help them safely & effectively realize the power of their data. We remain forever grateful for all the support we received from all of you.
- I asked GPT-3 to produce a funny limerick about privacy, search and ads. Here you go:
  There once was a user who whined
  "I don't want my data mined
  I don't like being spied on
  And I hate all these ads!"
  But the search engines just smiled
  And said "We'll do it anyway!"
  Impressive
- Excited to partner with @DaniYogatama @YiTayML and the @RekaAILabs to bring them to @SnowflakeDB Cortex. @RekaAILabs is the sleeper in the LLM wars. * Consistently top-tier models ✅ * Upcoming Reka Core model approaching GPT-4 ✅ * Top-tier team ✅ * Impressive execution ✅
  As a part of our commitment to helping our customers unlock the power of #AI on all types of data, we’re furthering our partnership with @RekaAILabs to bring gen AI to images, video and more to Snowflake Cortex. Learn more about our partnership: snowflake.com/blog/multimoda…
- More thinking re: the OpenAI ChatGPT GPT-4 plugin launch ... The browsing plugin (WebGPT) uses Bing search. Bing just increased prices on that API 30x (from $7 for 1k requests to $200 for 1k requests) for AI applications. OpenAI could not have pulled this off w/o MSFT ...
- At @Neeva , bi-encoders were 🔑 to great web retrieval performance. Smaller enterprise corpora => optimizing for recall is even more important in enterprise AI search. #AI #MachineLearning For the techniques that work, read the blog post, or get the tldr from the 🧵 ...
  New blog post from me and my colleagues at Snowflake: an explainer on training text embedding models (a key technology behind modern search). A moderately deep dive into the techniques that several top-scoring models use to improve performance. medium.com/snowflake/how-…





