This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe".
- lllyx/Qwen3-1.7B-SFT
  Text Generation • 2B • Updated • 529 • 4
- Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
  Paper • 2604.13016 • Published • 110
- lllyx/Qwen3-4B-Base-GRPO
  Text Generation • 4B • Updated • 374 • 3
- lllyx/OpenThought3-Qwen3-4B
  Viewer • Updated • 305k • 192 • 2