Alpin
4,545 posts
Every age, it seems, is tainted by the greed of men. Rubbish to one such as I, devoid of all worldly wants. — I work on HPC and making AI run faster.
Joined January 2020
- Perks of working with datacenters. SSD read/write speeds can't catch up with your network speeds.
- This is a myth. Only a handful of people are ensouled, mostly my mutuals. Overwhelming majority of people are NPCs.
- "Do they shoot you in San Francisco if you use capital letters?"
  Quoting: is it possible to pretrain a language model using pure reinforcement learning from scratch? random weights, no cross-entropy loss pretraining. you may have many questions in your head.
- Git but with private branches on public repos.
- Replying to @MesugakiArchive: It was so fucking bad. Almost everything about discount evangelion was bad.
- I'm slowly being vindicated on diffusion language models. Humans don't think auto-regressively (one word at a time), we have a general outline of a thought that clears up over time, then we put that into words. The only exception is when you're not thinking clearly and just say
  Quoting: Large Language Diffusion Models. Introduces LLaDA-8B, a large language diffusion model that pretrained on 2.3 trillion tokens using 0.13 million H800 GPU hours, followed by SFT on 4.5 million pairs. LLaDA 8B surpasses Llama-2 7B on nearly all 15 standard zero/few-shot learning
- Replying to @ShitpostRock: Just looked into it and this feels like such a huge nothingburger wtf
- Have you ever wanted a centralized TUI to monitor the GPU/CPU/Disk Usage on your remote servers? Well I wanted it, so I built one today. Supports both NVIDIA and AMD GPUs.
- Releasing: a dataset of two million Bluesky posts. This dataset has been collected using Bluesky's API, and I hope it will be useful for all the researchers out there!