15th September 2024
[… OpenAI’s o1] could work its way to a correct (and well-written) solution if provided a lot of hints and prodding, but did not generate the key conceptual ideas on its own, and did make some non-trivial mistakes. The experience seemed roughly on par with trying to advise a mediocre, but not completely incompetent, graduate student. However, this was an improvement over previous models, whose capability was closer to an actually incompetent graduate student.
Recent articles
- Claude Fable is relentlessly proactive - 11th June 2026
- Initial impressions of Claude Fable 5 - 9th June 2026
- Running Python code in a sandbox with MicroPython and WASM - 6th June 2026