Jens Tuyls
145 posts
Jens Tuyls
@JensTuyls
PhD @PrincetonCS. Previously CS & Eng. @UCIrvine. Studying AI, ML, RL, NLP.
Silicon Valley, CA
jenstuyls.com
Joined June 2016
1,220
Following
931
Followers

  • Pinned
    Jens Tuyls
    @JensTuyls
    Oct 14, 2025
    Can the knowledge in language model representations guide the search for novel behaviors? We find that exploration with a simple, principled, representation-based bonus improves diversity and pass@k rates for inference-time and post-training!
    22K
  • Jens Tuyls
    @JensTuyls
    Jul 19, 2023
    Imitation learning is one of the most widely used methods in ML, but how does compute affect its performance? We explore this question in the challenging game of NetHack and find our scaled-up agent to outperform prior SOTA by 2x! arxiv.org/abs/2307.09423 [1/6]
    22K
  • Jens Tuyls
    @JensTuyls
    Feb 14, 2022
    How can RL agents deal with both sparse rewards and large, dynamic action spaces – a key challenge in text games? Our method eXploit-Then-eXplore (XTX) tackles these challenges and achieves a more than 2x improvement on Zork! arxiv.org/abs/2201.01251 #ICLR2022 Spotlight 📜[1/5]
  • Jens Tuyls
    @JensTuyls
    Dec 11, 2023
    I’ll be at @NeurIPSConf this week! Feel free to reach out if you’d like to chat about anything scale in RL/IL, language agents (or broadly RL + NLP), or game theory!
    2.8K
  • Jens Tuyls
    @JensTuyls
    Aug 30, 2016
    Loving the new Alexa Skills Kit SDK for Node.js! github.com/alexa/alexa-sk… @alexadevs @amazonecho @AmazonAlexa #amazonecho
  • Jens Tuyls
    @JensTuyls
    Jul 19, 2023
    Replying to @JensTuyls
    See all of this and more in: Scaling Laws for Imitation Learning in NetHack by @JensTuyls, @DhruvMadeka, Kari Torkkola, Dean Foster, @karthik_r_n, @ShamKakade6 Paper: arxiv.org/abs/2307.09423 Project page: coming soon!
    948
  • Jens Tuyls
    @JensTuyls
    Jul 19, 2023
    Replying to @JensTuyls
    More broadly, our results call for work in the larger IL and RL community to more carefully consider the role of scaling laws, which could provide large improvements in many other domains. Also check out prior work by @openai: arxiv.org/abs/2301.13442. [5/6]
    526
  • Jens Tuyls
    @JensTuyls
    Jul 19, 2023
    Replying to @JensTuyls
    We train a suite of neural NetHack agents with different model sizes using Behavioral Cloning (BC) and analyze the loss and mean return isoFLOP profiles. We find both BC loss and mean return to follow clear power law trends with respect to FLOPs. [3/6]
    462
  • Jens Tuyls
    @JensTuyls
    Jul 19, 2023
    Replying to @JensTuyls
    Using these power laws, we forecast the model and data size needed to train an agent aimed at recovering the underlying expert. While our agent falls short of expert performance, it sets a new SOTA (2.7K) in the unsolved game of NetHack, surpassing the prior best by 2x! [4/6]
    368
  • Jens Tuyls
    @JensTuyls
    Jul 19, 2023
    Replying to @JensTuyls
    Prior works have found IL to consistently underperform the data-generating policy. However, these works often overlook the role of compute in terms of model and data size. Inspired by work around LLMs, we see if scaling up IL can provide similar performance gains. [2/6]
    460
  • Jens Tuyls
    @JensTuyls
    Jul 8, 2016
    Black smoke over the bay. What's happening? @ABC @CNN @CBSNews #fireInTheBay
  • Jens Tuyls
    @JensTuyls
    Feb 14, 2022
    Replying to @JensTuyls
    See all of this and more in: Multi-Stage Episodic Control for Strategic Exploration in Text Games by @JensTuyls, @ShunyuYao12, @ShamKakade6, @karthik_r_n Paper: arxiv.org/abs/2201.01251 Project page: sites.google.com/princeton.edu/… Code: github.com/princeton-nlp/…
    Multi-Stage Episodic Control for Strategic Exploration in Text Games
    Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards. The interplay of these two factors is...
  • Jens Tuyls
    @JensTuyls
    Feb 14, 2022
    Replying to @JensTuyls
    XTX employs a two-stage rollout in each episode to tackle these: (1) An *exploitation* policy trained on promising past trajectories returns to the frontier. (2) An *exploration* policy that uses past experience and curiosity explores the frontier. [3/5]
  • Jens Tuyls
    @JensTuyls
    Feb 14, 2022
    Replying to @JensTuyls
    XTX outperforms several competitive baselines across 12 games in the Jericho benchmark (avg norm. scores across games in fig) in both the deterministic and stochastic settings, showing the strength of our multi-stage approach with strategic exploration at the frontier. [4/5]
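
The two-stage rollout described in the [3/5] post can be illustrated with a toy sketch. This is not the XTX implementation. Everything here is a hypothetical stand-in: `ChainEnv` is a made-up one-dimensional environment, the exploitation stage simply replays the best stored action sequence (rather than a policy trained on promising trajectories), and the exploration stage picks random actions (rather than using past experience and curiosity). It only shows the "return to the frontier, then explore from it" structure of each episode.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: walk along a line from 0 to `length`."""

    def __init__(self, length=10):
        self.length = length
        self.reset()

    def reset(self):
        self.pos = 0
        self.best = 0  # furthest cell reached in this episode
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right); reward 1 for each newly reached cell
        self.pos = max(0, min(self.length, self.pos + action))
        reward = 1 if self.pos > self.best else 0
        self.best = max(self.best, self.pos)
        done = self.pos == self.length
        return self.pos, reward, done


def two_stage_rollout(env, exploit_actions, rng, max_steps=20):
    """Run one episode: replay stored actions, then explore at random."""
    env.reset()
    actions, score, frontier, done = [], 0, 0, False

    # Stage 1 (exploitation stand-in): replay the best past actions
    # to get back to the furthest point reached so far.
    for a in exploit_actions[:max_steps]:
        _, r, done = env.step(a)
        actions.append(a)
        score += r
        if r > 0:
            frontier = len(actions)
        if done:
            break

    # Stage 2 (exploration stand-in): random actions from the frontier.
    while not done and len(actions) < max_steps:
        a = rng.choice([-1, 1])
        _, r, done = env.step(a)
        actions.append(a)
        score += r
        if r > 0:
            frontier = len(actions)

    # Keep only the prefix that ends at the last rewarded step.
    return actions[:frontier], score


def train(episodes=50, seed=0):
    """Repeat rollouts, keeping the highest-scoring action prefix."""
    rng = random.Random(seed)
    env = ChainEnv()
    best_actions, best_score = [], 0
    for _ in range(episodes):
        actions, score = two_stage_rollout(env, best_actions, rng)
        if score > best_score:
            best_actions, best_score = actions, score
    return best_score


if __name__ == "__main__":
    print("best score:", train())
```

Because the toy environment is deterministic, replaying the stored prefix always recovers the previous best score before exploration begins, so the best score never decreases across episodes. The stochastic setting mentioned in the thread would need a learned policy rather than a fixed replay.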