TL;DR
ByteDance open-sources Stable-DiffCoder, an 8B diffusion-based code model that outperforms autoregressive baselines on code generation, completion, and editing tasks.
Key Points
- Introduces block diffusion continual pretraining (CPT) with tailored warmup and block-wise clipped noise schedule
- Achieves state-of-the-art performance among open-source 8B models across code generation, completion, editing, and reasoning benchmarks
- Uses identical architecture and data as AR baseline for fair comparison, isolating diffusion training's impact
- Publicly available on Hugging Face with three model variants; MIT licensed
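The "block-wise clipped noise schedule" in the first bullet is named but not specified in this summary. As an illustration only, a minimal sketch of what such a schedule could look like for masking-style discrete diffusion over token blocks, with a hypothetical block size and hypothetical `clip_min`/`clip_max` bounds, is:

```python
import numpy as np

rng = np.random.default_rng(0)

def clipped_block_mask(seq_len, block_size=32, clip_min=0.2, clip_max=0.8):
    """Sample one noise (masking) ratio per block, clipped to
    [clip_min, clip_max], and return a boolean mask over the sequence
    (True = token replaced with noise/mask during training).
    Hypothetical sketch -- not the released training code."""
    mask = np.zeros(seq_len, dtype=bool)
    for start in range(0, seq_len, block_size):
        end = min(start + block_size, seq_len)
        # Clipping keeps each block away from the degenerate extremes
        # (almost no tokens masked, or nearly all tokens masked).
        ratio = float(np.clip(rng.uniform(0.0, 1.0), clip_min, clip_max))
        n_mask = int(round(ratio * (end - start)))
        idx = rng.choice(end - start, size=n_mask, replace=False)
        mask[start + idx] = True
    return mask

m = clipped_block_mask(128)
```

Each block then receives an independent, bounded noise level, which is one plausible reading of combining block diffusion with a clipped schedule; the actual schedule used by Stable-DiffCoder is described in the project's repository.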
Why It Matters
Demonstrates that diffusion-based training can improve code modeling quality beyond autoregressive approaches, even when architecture, parameter count, and training data are held constant. Provides actionable training insights and a production-ready alternative for developers building code generation systems at scale.
Source: github.com