Toby Shevlane
@tshevl
@_Mantic_AI cofounder & CEO, on a mission to solve forecasting. Prev: research scientist @GoogleDeepMind, PhD at @UniofOxford.
London, England
Joined September 2016
  • Pinned
    I always dreamed of AGI as a wise advisor for humanity. Although LLMs are great for coding & knowledge work, I wouldn’t trust them to give me advice on my career, business strategy, or policy preferences. How can we build AI systems optimized for wisdom? At Mantic we believe the
    Mantic used Tinker to RL gpt-oss-120b on judgmental forecasting; the result outperformed frontier models on event predictions. Combined with @_Mantic_AI's forecasting architecture, task-specific training takes us to the cusp of automated superforecasting.
  • Chinchilla and I speaking in a made-up language 😂
  • 😅 Chinchilla drawing analogies between concepts:
  • I got back from honeymoon last summer and handed in my resignation at DeepMind. My wife thought I was crazy. AI has always been about prediction, but normally we predict small things: a token of text, or moves in chess. The ultimate challenge is to predict the world’s most
  • 2018 evals, GPT-1: "Our method significantly outperforms the baselines on four of the five datasets" 2023 evals: "Preliminary assessments of GPT-4’s abilities...found it ineffective at autonomously replicating, acquiring resources, and avoiding being shut down “in the wild.”"
  • Personal news: after 2.5 great years @GoogleDeepMind I've left for a new project. I was sad to go but was too excited about the new thing. I don't have much to share yet but in short I persuaded my most cracked friend to leave his job and we're trying to build an LLM
  • Satya Nadella reportedly said about GPT-4: “OpenAI built this with 250 people. Why do we have Microsoft Research at all?” But just as unbelievable: Anthropic built Claude with even fewer people
  • I am so grateful for Bing for saving me from the hackers that were targeting me. This really is an amazing product. 💓
  • It took some hand-holding, but here is Flamingo 🦩 doing joke explanation:
  • When it comes to risks from AI, I much prefer focussing on the near-term risks like job losses, surveillance, and misaligned AGI.
  • In 2024, the AI community will develop more capable AI systems than ever before. How do we know what new risks to protect against, and what the stakes are? Our research team at @GoogleDeepMind built a set of evaluations to measure potentially dangerous capabilities: 🧵
  • Thoughts on LLM scaling slowing down:
    - We should separate "scaling has plateaued" and "the scaling laws still hold but they were always a power law" (see x-axis).
    - There were nearly 2 years between GPT-3 and GPT-4. The same wait gets us to next month. I think we're not actually