QwQ-32B: Embracing the Power of Reinforcement Learning. New Apache 2 licensed reasoning model from Qwen:
We are excited to introduce QwQ-32B, a model with 32 billion parameters that achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated). This remarkable outcome underscores the effectiveness of RL when applied to robust foundation models pretrained on extensive world knowledge.
I've not run this myself yet but I had a lot of fun trying out their previous QwQ reasoning model last November.
LM Studio just released GGUFs ranging in size from 17.2 to 34.8 GB. MLX have compatible weights published in 3bit, 4bit, 6bit and 8bit. Ollama has the new qwq too - it looks like they've renamed the previous November release to qwq:32b-preview.
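As a rough sanity check on those download sizes, here's a back-of-the-envelope sketch of how big the weights should be at each bit width. It assumes a round 32 billion parameters and a uniform number of bits per weight, which is a simplification: real quantized files carry metadata and often keep some tensors at higher precision, so actual files tend to come out somewhat larger. The function name `estimate_weight_size_gb` is made up for this illustration.

```python
def estimate_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no overhead for quantization scales, metadata or unquantized tensors.
    """
    return n_params * bits_per_weight / 8 / 1e9


# "32 billion parameters" is the rounded figure from the announcement.
N_PARAMS = 32e9

# The same bit widths as the published MLX weights.
for bits in (3, 4, 6, 8):
    size = estimate_weight_size_gb(N_PARAMS, bits)
    print(f"{bits}-bit: ~{size:.0f} GB")
```

That prints roughly 12, 16, 24 and 32 GB. The 17.2 to 34.8 GB GGUF range would be consistent with something like 4-bit through 8-bit quantization plus overhead, but that's an inference from the arithmetic, not something the release states.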