28th January 2025
The most surprising part of DeepSeek-R1 is that it only takes ~800k samples of 'good' RL reasoning to convert other models into RL-reasoners. Now that DeepSeek-R1 is available, people will be able to refine samples out of it to convert any other model into an RL reasoner.