🇺🇦 Alex Polozov
@Skiminok
Member of Technical Staff @reflection_ai • previously @GoogleDeepMind, @Theteamatx, @MSFTResearch, @uwcse • program synthesis, AI for Code & SWE
New York
Joined March 2009
Posts
  • Pinned
    🎉 Next week, I am excited to join @reflection_ai as a Member of Technical Staff to help build the open intelligence ecosystem of the Western world. It's the most exciting opportunity to help software builders in our time, and will shape many years of AI Engineering in the
  • Hi, and welcome to your system design interview! For the next hour, we'll design a simple microblog service. Requirements:
    - 1M+ users
    - 5000 QPS
    - 1 sec post latency
    - Can survive a hostile CEO takeover for at least 10 days.
  • "we've optimized ... less for math and computer science competitions, and more for real life" Finally, someone says it out loud!
    Replying to @AnthropicAI
    Claude 3.7 Sonnet is a state-of-the-art model for both coding and agentic tool use. In developing it, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect the needs of our users.
    Performance of different AI models on the SWE-bench Verified benchmark. Claude 3.7 Sonnet significantly outperforms other models with 70.3% accuracy "with custom scaffold" (62.3% base performance), while Claude 3.5 Sonnet, OpenAI o1, OpenAI o3-mini (high), and DeepSeek R1 all show similar performance around 49% accuracy.
  • I bet Google finally feels relieved they never made a popular social network.
  • OpenAI employees: oh no, someone leaked our model architecture details on the Internet! Google employees:
    i might have heard the same 😃 -- I guess info like this is passed around but no one wants to say it out loud. GPT-4: 8 x 220B experts trained with different data/task distributions and 16-iter inference. Glad that Geohot said it out loud. Though, at this point, GPT-4 is
  • So stoked to finally discuss Copilot! I've used it inside MSR for months, watched it evolve, and discussed collabs. [Disclaimer: the tech is by the amazing @github/@OpenAI, I'm an informed observer.] Not exaggerating, Copilot will be in top-3 tech developments of 2020s 🧵👇
    Today, @github, @OpenAI and @Microsoft launched a technical preview of GitHub Copilot. It’s a great example of how advancements in #AI are producing powerful new tools to help developers write better code - and spur more creativity and innovation. copilot.github.com
  • Ending a chapter is always bittersweet. I love Google, I adore DeepMind, and will continue rooting for them in our once-in-a-generation era of building AGI 🚀 To my friends in Gemini, in @julesagent, and in the internal moonshots – you rock and I'll cherish the time working
  • Hey, ML/PL enthusiasts! Looking for some "light" reading for the holiday break? FnT just published our survey on "Neurosymbolic Programming", written jointly with @swarat, Kevin Ellis, @rishabhs, Armando Solar-Lezama, and @yisongyue. nowpublishers.com/article/Detail…
  • I'm just going to go ahead and say it. Working & staying motivated after a year of pandemic + political turmoil + isolation is 𝘀𝘁𝗶𝗹𝗹 𝗵𝗮𝗿𝗱. You might think that me, or your colleagues, or someone else in your attention network breezes through it. That's a lie.
  • ❗ We are hiring! Our Google team is looking for 2 (two) Senior Research Engineers in Large Language Models. Primarily working on code generation, but broad LLM expertise matters most. Here, you get a chance to advance cutting-edge AI research AND ship it to help real users 😉
  • I'm so pissed I'm going to take this rant public. WTF are you trying to accomplish with leaks? Is it just the ego thrill of importance? 100s of Googlers work *hard* to keep publishing & scientific collabs alive. And you just make precedent to silo it all.
  • New blog post: "Program Synthesis in 2017-18". An update on our 18-month old program synthesis survey, highlighting the most interesting research of the last two years. We're not yet close to putting programmers out of jobs, but we're getting there 🙂 alexpolozov.com/blog/program-s…
  • Our team at @Theteamatx has collaborated with @GoogleAI on 🌴 PaLM – a single 540B-parameter dense language model for multiple domains & tasks, trained over two TPUv4 Pods. PaLM-Coder is an adaptation of PaLM fine-tuned on code and evaluated on software engineering tasks. 1/
    Introducing the 540 billion parameter Pathways Language Model. Trained on two Cloud #TPU v4 pods, it achieves state-of-the-art performance on benchmarks and shows exciting capabilities like mathematical reasoning, code writing, and even explaining jokes. goo.gle/3j6eMnK
  • This dude in Rwanda wasn't fond of my selfie 😅