Self-Distillation Enables Continual Learning
Idan Shenfeld, Mehul Damani, Jonas Hübotter, Pulkit Agrawal
Published as an arXiv preprint, 2026
An on-policy self-distillation algorithm that enables on-policy learning from demonstrations, achieving better performance than supervised fine-tuning (SFT) without catastrophic forgetting.
