Sitan Chen (@sitanch) / X

Sitan Chen

198 posts

@sitanch

assistant professor of computer science @hseas, learning theorist, 🎹

Joined April 2020

Pinned
Sitan Chen
@sitanch
Feb 11, 2025
Excited about this new work where we dig into the role of token order in masked diffusions! MDMs train on some horribly hard tasks, but careful planning at inference can sidestep the hardest ones, dramatically improving over vanilla MDM sampling (e.g. 7%->90% acc on Sudoku) 1/
Sitan Chen
@sitanch
Oct 14, 2022
What are noisy intermediate-scale quantum devices good for? In a new paper arxiv.org/abs/2210.07234 joint with @JordanCotler, @RobertHuangHY, and @jerryzli, we define and study a new complexity class, NISQ, that captures the computational power of these devices 🧵 (1/n)
Sitan Chen
@sitanch
Apr 12, 2022
Excited to share something I've been working on over the last year, joint with @jerryzli, Yuanzhi Li, and Anru Zhang! arxiv.org/abs/2204.04209 We give provably efficient algorithms for learning a rich family of "pushforward distributions" inspired by generative models. 1/n
Sitan Chen
@sitanch
Sep 23, 2022
New paper up, joint w/ Sinho Chewi, @jerryzli, Yuanzhi Li, @AdilSlm, and Anru Zhang arxiv.org/abs/2209.11215 We prove diffusion models can efficiently sample from practically any distribution, even highly non-log-concave ones, given reasonably accurate score estimation (1/n)
Sitan Chen
@sitanch
Feb 19, 2024
Proving optimization guarantees for transformers is hard, even if just training on seq2seq pairs for which we know some small transformer achieves zero test loss. In practice gradient descent just works. In theory, it's open to prove *any* efficient algorithm succeeds 🥲 1/
Sitan Chen
@sitanch
Sep 23, 2024
Guidance is one of the key ingredients behind diffusion models' impressive generation capabilities. But what does it actually do? In new work led by @mle_muthu + Khashayar and joint w/ @oldheneel + Jianfeng, we rigorously pin down its behavior in a simple but rich setting 🧵1/
Sitan Chen
@sitanch
Mar 24, 2024
Nice thread on one of my favorite classical physics concepts, the diffraction limit! While Rayleigh's criterion is widely viewed as just a rule of thumb (even by Rayleigh 🧐), Ankur Moitra and I proved it can be regarded as a phase transition for mixture model learning 1/
Peyman Milanfar
@docmilanfar
Mar 24, 2024
What is resolution in an image? It is not the number of pixels. Here’s the classical Rayleigh’s criterion taught in basic physics: 1/5
Sitan Chen
@sitanch
Oct 11, 2024
Excited to announce @JordanCotler, @RobertHuangHY, @jerryzli, and I are organizing a workshop at FOCS on quantum learning ⚛️! There have been a ton of exciting works in this rapidly growing area the last few years, many coming from fruitful interactions between physics+TCS 1/
Sitan Chen
@sitanch
Nov 26, 2024
Excited to announce new work, joint with Kerem Dayi, on training dynamics of LoRA beyond the kernel regime! tl;dr fine-tuning naturally interpolates between NTK and feature learning, and we prove it can behave genuinely differently from either 1/
Sitan Chen
@sitanch
Oct 27, 2024
Jerry Li kicking off our quantum learning workshop at FOCS ’24 with a TCS-friendly crash course! See the webpage for details on how to tune in remotely: jerryzli.github.io/focs24-worksho…
Sitan Chen
@sitanch
May 1, 2023
To appear at ICML ’23 arxiv.org/abs/2303.03384 We obtain non-asymptotic convergence bounds for *deterministic* diffusion model samplers, as well as a new operational interpretation for the probability flow ODE 🏖 1/7
Sitan Chen
@sitanch
Aug 16, 2024
Given copies of unknown quantum state ρ, can we quantify how far it is from being classically simulable ⚛️🎲? Better yet, can we learn the closest approximation by such a state? In new work, we give the first polynomial time algorithm for this problem 1/
Sitan Chen
@sitanch
Oct 27, 2022
Turns out the complexity of predicting unknown quantum evolutions is tied to a cute puzzle: given a spectrally bounded linear combo of Paulis, how big can the L_p norm of the coefficients be? 🧐 Check out Robert’s awesome thread on our recent work with @preskill on this problem!
Hsin-Yuan Huang (Robert)
@RobertHuangHY
Oct 27, 2022
🤖 Can a machine *efficiently* learn and predict quantum dynamics with arbitrarily high complexity (e.g, exponentially high)? In our new paper arxiv.org/abs/2210.14894 with @sitanch and @preskill, we give an ML algorithm and prove that it accomplishes this wild task (📜1/7)
Sitan Chen
@sitanch
Mar 7, 2024
Check out this awesome thread by Marvin on our recent work giving theory to understand "critical windows" in diffusion models, a phase transition whereby key features are determined in a narrow window during sampling! Fun blend of math and Stable Diffusion experiments 1/
Marvin Li
@marvin_li03
Mar 7, 2024
In diffusion models, the features of generated images emerge in narrow time intervals of the reverse process 😮—can we provably characterize these “critical windows” in which features are decided? W/ @sitanch we describe critical windows for a rich family of distributions. (1/n)