The most surprising part of DeepSeek-R1 is that it only takes ~800k samples of 'good' RL reasoning to convert other models into RL-reasoners. Now that DeepSeek-R1 is available people will be able to refine samples out of it to convert any other model into an RL reasoner.
Recent articles
- Using pip to install a Large Language Model that's under 100MB - 7th February 2025
- OpenAI o3-mini, now available in LLM - 31st January 2025
- A selfish personal argument for releasing code as Open Source - 24th January 2025