28th January 2025
The most surprising part of DeepSeek-R1 is that it only takes ~800k samples of 'good' RL reasoning to convert other models into RL-reasoners. Now that DeepSeek-R1 is available people will be able to refine samples out of it to convert any other model into an RL reasoner.
Recent articles
- Changes in the system prompt between Claude Opus 4.6 and 4.7 - 18th April 2026
- Join us at PyCon US 2026 in Long Beach - we have new AI and security tracks this year - 17th April 2026
- Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 - 16th April 2026