It seems @huggingface and @MistralAI are sharing some secrets, and I've just found them out in the docs. Love you guys!
Machine Learning and AI researcher wayyy before all this hype.
Joined December 2023
- Ladies and Gentlemen! @erhartford, @latkins and me are about to drive all of you fucking crazy ... THIS IS THE BEST DOLPHIN RELEASE FUCKING EVER!!!!! 8B MODEL BEASTTTTTT
- Me and @erhartford are pleased to announce [maybe] the first successful LASER model on @huggingface. Our model showed better benchmark results than our latest DPO version of the Dolphin finetune of Mistral AI's Mistral 7b. Pt 1
- Hi, folks! Me, @DavidGFar , @latkins and @erhartford cannot stop inventing new crazy stuff. Now we are delighted to announce Kraken, sponsored by @HyperspaceAI and @VAGOsolutions. (1/N)
- Hi! Me and @erhartford are open-sourcing our LaserRMT implementation of the original Laser paper. We improved the search algorithm by employing random matrix theory and Marchenko-Pastur theory. Let's get loads of models 'lasered' on @huggingface
- LLMs and autistic people: For those who don't know, I am the father of an autistic child and an LLM researcher. One thing that has caught my attention is a slight similarity between the behavior of autistic children and LLMs, and eventually our definition of intelligence. (1/4)
- Me, @erhartford and David Golchinfar are pleased to announce our new model: Cognitive Computations - Laserxtral 4x7B. This is basically a MoE built using the mergekit provided by Charles Goddard. This model exhibits strong reasoning capabilities and truthfulness. (1/3)
- This was a very smart trick we have had with @erhartford . We have created a small HF Transformer = PyTorch hack to enable an "online passthrough" frankenmerge that loops in the forward method. Hence we have the same model results, but way less vRAM use. We are excited! (1/2)
- After a little pushing from @ivanfioravanti, we (me, @erhartford and @DavidGFar) are releasing scripts for laserRMT compatible with MPS. So now modelers can scan their models and laser them. Thanks @HyperspaceAI and @VAGOsolutions for the support.
- Replying to @FernandoNetoAi... Yes you can. You can mixup whatever you can. And we are open sourcing the whole pipeline to achieve that as well. Welcome to Kraken! [GitHub]: github.com/cognitivecompu… [Demo Model]: huggingface.co/cognitivecompu…
- Now it is OFFICIAL! BTW, its MMLU score is VERY close to gpt4 (86.9). I don't wanna talk too much, but this is the SOTA in open source models. So glad to be working with Eric and @latkins on enabling this. Thanks @Alibaba_Qwen for the excellent base model! Cognitive Computations presents Dolphin-2.9.2-Qwen2-72b. The best Dolphin ever. Thanks to @Alibaba_Qwen for the excellent base model! 83.9 mmlu and 128k context! New in 2.9.2 is SystemChat - A dataset designed to teach the model to obey the system prompt, even over a long
- Me, @DavidGFar and @erhartford are proud to share our new notebook (Laser Qlora). How can we spot layers that are more prone to absorb new knowledge and continue further fine-tuning a pre-existing sft model??? Thanks @HyperspaceAI and @VAGOsolutions for supporting. (Link below)
- And the best 7b model on the HF leaderboard is a LaserRMT one <3 ... Feeling proud with @erhartford and @DavidGFar ... Congratulations to Tim Dollan!










