I’ve been at OpenAI for almost a year now. In that time, I’ve trained a lot of generative models. [...] It’s becoming awfully clear to me that these models are truly approximating their datasets to an incredible degree. [...] What this manifests as is – trained on the same dataset for long enough, pretty much every model with enough weights and training time converges to the same point. [...] This is a surprising observation! It implies that model behavior is not determined by architecture, hyperparameters, or optimizer choices. It’s determined by your dataset, nothing else. Everything else is a means to an end in efficiently delivery compute to approximating that dataset.
Recent articles
- Recreating the Apollo AI adoption rate chart with GPT-5, Python and Pyodide - 9th September 2025
- GPT-5 Thinking in ChatGPT (aka Research Goblin) is shockingly good at search - 6th September 2025
- V&A East Storehouse and Operation Mincemeat in London - 27th August 2025