This past weekend I finetuned LLaMA 7B and 13B following the Stanford Alpaca repo, on 20k code generation/editing/optimization instructions. The 13B model performs impressively on small-scoped, well-defined instructions. I have released the code and data here -
Sahil Chaudhary
- Releasing instruct-codegen-16B today. It is a finetuned version of codegen-16B-multi on a dataset of 250k alpaca style codegen instruction samples, and achieves a pass@1 of 37.1%
- Excited to announce that I'm building @GlaiveAI, helping companies train use-case-specific small language models with synthetic data, with the goal of commoditising language models
- Built a gradio demo to try out CodeAlpaca on huggingface spaces -
- Really excited to share that @glaiveAI has raised a $3.5M seed round led by @sparkcapital with participation from @villageglobal and @amasad glaive.ai/blog/seed-roun…
- Replying to @csahil28: I’m releasing model weights, training data, scripts, and eval code to help reproduce benchmark scores. Postmortem- glaive.ai/blog/post/refl… Weights- huggingface.co/glaiveai/Refle… Eval code- github.com/glaive-ai/simp… Training code- github.com/glaive-ai/refl… @ricklamers has also put
- Releasing a new code model and dataset today, glaive-coder-7b and the 130k+ samples used to train the model licensed as apache-2.0. Instead of just reporting the benchmarks, we also release the Code Models Arena to change the way we evaluate code models. glaive.ai/blog/releasing…
- replit-code-3B finetuned on a Glaive-generated dataset of 1B tokens gets a pass@1 of 63.5 huggingface.co/sahil2801/repl…
- Replying to @csahil28: Along with the announcement, releasing a 2.7B open source model with function calling abilities similar to gpt-3.5 while being significantly smaller.
- For anyone wanting to use CodeAlpaca as an api, you can deploy this template on @BananaDev_ in <5 mins - app.banana.dev/templates/sahi…
- Nothing beats GPTrillion banana.dev/blog/introduci…
- Excited to share what we have been building for the past few months. If you are building an AI powered product, would love to onboard you to Glaive.
- Working on some very exciting tech @BananaDev_ to unlock a true serverless gpu experience, this is what cold starts on banana are gonna look like soon