VibeThinker-3B: A 3B Dense Reasoning Model Built on Qwen2.5-Coder-3B With the Spectrum-to-Signal Post-Training Pipeline
While recent breakthroughs in AI reasoning have largely been driven by massive scale, pouring in billions of parameters to cross complex cognitive thresholds— VibeThinker-3B is charting a completely different path. Created by researchers from Sina Weibo Inc (China), this 3-billion-parameter model proves that efficiency can punch far above its weight class. Released under an open-source MIT license, VibeThinker-3B matches the performance of models hundreds of times its size on verifiable tasks like mathematics, coding, and STEM disciplines. What is VibeThinker-3B VibeThinker-3B is a compact dense model built on the Qwen2.5-Coder-3B base. It is post-trained, not pretrained from scratch. The research team applies supervised fine-tuning, reinforcement learning, and self-distillation on top. The training framework continues the Spectrum-to-Signal Principle (SSP) from the earlier VibeThinker-1.5B. SFT (Supervised Fine-Tuning) builds a broad space of valid reasoning paths, t...
