Publications

Self-Distillation Enables Continual Learning

Idan Shenfeld, Mehul Damani, Jonas Hübotter, Pulkit Agrawal
Published as an arXiv preprint, 2026

An on-policy self-distillation algorithm that enables on-policy learning from demonstrations, achieving better performance than supervised fine-tuning (SFT) while avoiding catastrophic forgetting.