I am a fourth-year Ph.D. student at MIT advised by Professor Pulkit Agrawal. My research focuses on reinforcement learning algorithms and their applications, mainly in NLP and robotics. My research has been funded by the Qualcomm Innovation Fellowship and the MIT-Google Collaboration Grant.

I was most recently at Google DeepMind, working on post-training for LLMs. I have a B.Sc. in Computer Engineering from the Technion, where I worked with Aviv Tamar on reinforcement learning research.

News

[May 2026] SDFT won the Best Paper Award at the Lifelong Agents Workshop, ICLR 2026
[Apr 2026] Invited talk at IBM Research
[Apr 2026] Invited talk at UT Austin
[Apr 2026] Interview on Deep Learning with Yacine
[Mar 2026] Invited talk at FAIR
[Dec 2025] RL's Razor won the Outstanding Paper Award at the CCFM Workshop, NeurIPS 2025
[Jul 2025] Language Model Personalization via Reward Factorization will be presented as an oral at the MoFA Workshop, ICML 2025
[Sep 2024] Awarded the Qualcomm 2024 Innovation Fellowship
[Jun 2024] I'm joining Google DeepMind as a student researcher for summer 2024
[May 2024] I will give an oral presentation at the ICLR 2024 R2-FM workshop on our Value Augmented Sampling paper
[Jan 2024] Curiosity-driven Red-teaming for Large Language Models was accepted to ICLR 2024
[Sep 2024] Teaching Assistant for the Computational Sensorimotor Learning course
[Sep 2023] Invited talk at Hyundai Research
[Jun 2023] Invited talk at the Technion
[Apr 2023] Our TGRL paper was accepted to ICML 2023, see you in Hawaii!

Publications


Aligning Language Models From User Interactions

Thomas Kleine Buening, Jonas Hübotter, Barna Pásztor, Idan Shenfeld, Giorgia Ramponi, Andreas Krause

arXiv preprint, 2026

Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models

Isha Puri, Mehul Damani, Idan Shenfeld, Marzyeh Ghassemi, Jacob Andreas, Yoon Kim

ICML, 2026

Self-Distillation Enables Continual Learning

Idan Shenfeld, Mehul Damani, Jonas Hübotter, Pulkit Agrawal

ICML, 2026

Best Paper Award at Lifelong Agents Workshop, ICLR 2026

Reinforcement Learning via Self-Distillation

Jonas Hübotter, Frederike Lübeck, ..., Idan Shenfeld, ..., Andreas Krause

ICML, 2026

RL's Razor: Why Online Reinforcement Learning Forgets Less

Idan Shenfeld, Jyothish Pari, Pulkit Agrawal

ICLR, 2026

Outstanding Paper Award at the CCFM Workshop, NeurIPS 2025

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty

Mehul Damani, Isha Puri, Stewart Slocum, Idan Shenfeld, Leshem Choshen, Yoon Kim, Jacob Andreas

ICLR, 2026

Best-of-n through the Smoothing Lens: KL Divergence and Regret Analysis

Gholamali Aminian, Idan Shenfeld, Amir Asadi, Ahmad Beirami, Youssef Mroueh

ICLR, 2026

Language Model Personalization via Reward Factorization

Idan Shenfeld, Felix Faltings, Pulkit Agrawal, Aldo Pacchiano

COLM, 2025

KL-Regularized RLHF with Multiple Reference Models: Exact Solutions and Sample Complexity

Gholamali Aminian, Amir Asadi, Idan Shenfeld, Youssef Mroueh

NeurIPS, 2025

Learning How Hard to Think: Input-Adaptive Allocation of LM Computation

Mehul Damani, Idan Shenfeld, Andi Peng, Andreea Bobu, Jacob Andreas

ICLR, 2025

The Future of Open Human Feedback

Shachar Don-Yehiya, Ben Burtenshaw, ..., Idan Shenfeld, ..., Leshem Choshen

Nature Machine Intelligence, 2025

Value Augmented Sampling for Language Model Alignment and Personalization

Idan Shenfeld, Seungwook Han, Akash Srivastava, Yoon Kim, Pulkit Agrawal

Oral presentation at Workshop on Reliable and Responsible Foundation Models, ICLR 2024

Curiosity-driven Red-teaming for Large Language Models

Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James R. Glass, Akash Srivastava, Pulkit Agrawal

ICLR, 2024

From Imitation to Refinement: Residual RL for Precise Visual Assembly

Lars Lien Ankile, Anthony Simeonov, Idan Shenfeld, Marcel Torne Villasevil, Pulkit Agrawal

ICRA, 2025

Juicer: Data-efficient Imitation Learning for Robotic Assembly

Lars Ankile, Anthony Simeonov, Idan Shenfeld, Pulkit Agrawal

IROS, 2024

TGRL: An Algorithm for Teacher Guided Reinforcement Learning

Idan Shenfeld, Zhang-Wei Hong, Aviv Tamar, Pulkit Agrawal

ICML, 2023

Oral presentation at the RRL Workshop, ICLR 2023

Offline Meta Reinforcement Learning - Identifiability Challenges and Effective Data Collection Strategies

Ron Dorfman, Idan Shenfeld, Aviv Tamar

NeurIPS, 2021