mrinank (@MrinankSharma) / X

1,367 posts

poet // researcher
may we each follow our threads
everything has to do with loving and not loving -rumi

Joined July 2019

Pinned
mrinank
@MrinankSharma
Feb 9
Today is my last day at Anthropic. I resigned. Here is the letter I shared with my colleagues, explaining my decision.
mrinank
@MrinankSharma
May 31, 2020
Really excited to share our new preprint modelling the effectiveness and surveying the burden of 9 different non-pharmaceutical interventions against #COVID19 transmission, using data collected across 41 different countries. Full Preprint: doi.org/10.1101/2020.0… 1/8
mrinank
@MrinankSharma
Jul 17, 2024
where are the modern day mystics?
mrinank
@MrinankSharma
Mar 13, 2023
Do Bayesian Neural Networks need to be fully stochastic? In our ✨ AISTATS Oral 🌼, we answer with a resounding "no". Partially stochastic networks are
- just as expressive
- just as principled
- and often better performing than more costly fully stochastic networks
details👇🏽
mrinank
@MrinankSharma
Apr 3, 2024
we're setting up a new sangha in san francisco for those exploring the teachings of rob burbea, including samadhi, soulmaking, emptiness, and everything else rob offered 🔥 let me know if you'd like to come! share with friends! hope to see you there
mrinank
@MrinankSharma
Mar 14, 2024
i passed my phd viva today !!! i give thanks to all of the (countless) beings that supported me and contributed to this <3 thank you @yeewhye @tom_rainforth @eric_nalisnick and so many others <3
mrinank
@MrinankSharma
Sep 12, 2024
i'll be running some MATS projects in the winter around adversarial robustness with @EthanJPerez
if you're interested in AI safety research, but looking for mentorship, i really strongly recommend MATS! feel free to DM me if you have questions :-)
Ryan Kidd
@ryan_kidd44
Sep 12, 2024
@MATSprogram Winter 2024-25 mentors include researchers from @AnthropicAI @GoogleDeepMind @aisafetyinst redwoodresearch.org @CNASdc @CHAI_Berkeley @AlgAlignMIT @farairesearch @cais @apolloaisafety @kasl_ai @MIRIBerkeley and more! Apply by Oct 6. matsprogram.org/mentors
mrinank
@MrinankSharma
Sep 17, 2024
come and help us improve adversarial robustness of frontier LLMs at @AnthropicAI
as LLMs become more capable, robustness issues will pose larger misuse risks, but as carlini says, the academic community has made "limited progress" so far
FAR.AI
@farairesearch
Sep 17, 2024
"Please learn from our mistakes. Don't do exactly the same things that we did, or you'll end up in ten years with having nothing to show for it." — Nicholas Carlini urging AI researchers to avoid the pitfalls of past adversarial ML research at the Vienna Alignment Workshop 2024.
mrinank
@MrinankSharma
Jul 16, 2024
after years of Metta practice, it is absolutely striking and humbling to see how much hatred and ill will can arise in me
mrinank
@MrinankSharma
Feb 3, 2025
New paper: we defend LLMs against universal jailbreaks across thousands of hours of red-teaming. This work happened because of Anthropic's Responsible Scaling Policy. We sat down, set an ambitious robustness goal, pivoted to get there, and then executed. Read more below:
Anthropic
@AnthropicAI
Feb 3, 2025
New Anthropic research: Constitutional Classifiers to defend against universal jailbreaks. We’re releasing a paper along with a demo where we challenge you to jailbreak the system.
mrinank
@MrinankSharma
Sep 2, 2020
Funded ML PhD positions @ Oxford! if anyone has questions about the AIMS programme including course structure, admissions etc, or any questions about Oxford in general, i'd be more than happy to help, just send me a message :)
Autonomous Intelligent Machines & Systems @Oxford
@aims_oxford
Sep 2, 2020
@aims_oxford @UniofOxford is now open to receive applications for entry in October 2021. aims.robots.ox.ac.uk/study/
mrinank
@MrinankSharma
Oct 23, 2023
really excited to release our paper on understanding sycophancy in language models 🎉 check out the thread for a good summary ✨ this work provides empirical evidence we'll need to go beyond using unaided non-expert human feedback to build reliable ai
Anthropic
@AnthropicAI
Oct 23, 2023
AI assistants are trained to give responses that humans like. Our new paper shows that these systems frequently produce ‘sycophantic’ responses that appeal to users but are inaccurate. Our analysis suggests human feedback contributes to this behavior.
mrinank
@MrinankSharma
Nov 13, 2024
our work on jailbreak rapid response is out! it offers an extremely pragmatic alternative to "achieve perfect robustness" that could mitigate real-world misuse
if you're interested in doing research on robustness and misuse, my team is hiring! DM me :-)
Anthropic
@AnthropicAI
Nov 13, 2024
New research: Jailbreak Rapid Response. Ensuring perfect jailbreak robustness is hard. We propose an alternative: adaptive techniques that rapidly block new classes of jailbreak as they’re detected. Read our paper with @MATSprogram: arxiv.org/abs/2411.07494
mrinank
@MrinankSharma
Oct 6, 2021
Thrilled to share our latest work—Understanding the effectiveness of government interventions against the resurgence of COVID-19 in Europe, out now in @NatureComms nature.com/articles/s4146… 1/