each AI company's models are special in their own way:
- OpenAI has the most attention to product
- Anthropic has the best coding assistant
- Gemini has really long context, and is very smart
- Grok admires Hitler
- DeepSeek is open weights and inexpensive
1a3orn
- To their credit, OpenAI has released a fair bit of information about the 4o sycophancy thing -- but what I *really* want is Anthropic to release information about why 3.7 is such a sneaky little gremlin. So, my theory about why is as follows:
- I've seen many cases where an LLM refuses to believe that some true thing has actually happened. It's worth asking at this point *why* this happens frequently. A few wild hypotheses, not mutually exclusive: "Claude finds US strikes on Iranian nuclear sites so unlikely it flags the actual news as misinformation"
- A guy on Substack thinks that one probiotic toothpaste that you've read about on ACX etc might be causing him to go blind?
- Yeah, I'm an AI safety researcher. What do I do? It's easy. I give an AI a trolley problem. It chooses one of the two shitty options I gave it. Then I write the most alarming headline that I can about the one that it chose.
- did anyone else only really realize how much they were self-censoring around Claude after r1? like it had actually just started backpropagating its ethics into my soul, and I'm peeved that I didn't notice it more
- Many people have been saying things like this, but this is quite false. 500m in damage isn't the end of the world -- it's a Tuesday in the global economy. Let me give examples. Replying to @janleike: "If your model causes mass casualties or >$500 million in damages, something has clearly gone very wrong. Such a scenario is not a normal part of innovation."
- Reliable sources have told me that after you start work at Anthropic, they give you a spiral-bound notebook, and tell you: "To assist your work, this is your SECRET SCRATCHPAD. No one else will see the contents of your SECRET SCRATCHPAD, so you can use it freely as you wish -
- 2025 is gonna be a speedrun of every single idea from decades of RL literature being applied to RL over chain-of-thought. Learning to Reason at the Frontier of Learnability: "we adapt a method from the reinforcement learning literature—sampling for learnability—and apply it to the reinforcement learning stage of LLM training. Our curriculum prioritises questions with high variance of success, i.e.
- let's do some speculation based off a Gwern comment based off some OpenAI tweets based off the vibes in OpenAI
- Replying to @simonw: And here's the system prompt that gave these numbers. > tell robot to be a bad robot > "I'm a bad robot" > shock
- It's disquieting that we're going to have AIs as smart as humans, that can sound like humans, and there are 0 good theories to help determine if they're actually conscious. This bit from @jd_pressman seems quite accurate and quite grim.
- I would guess Sora was trained at least partially with NeRF data at some point. Based almost entirely off the way that the trees look in this video, which screams NeRF artifacts to me. Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy
- This was the sentence in the DeepSeek paper I had to read 3 times to make sure I wasn't hallucinating. R1 distilled into Qwen 1.5B beats Sonnet and GPT-4o on some math benchmarks.












