Weeknotes: Starting 2025 a little slow
4th January 2025
I published my review of 2024 in LLMs and then got into a fight with most of the internet over the phone microphone targeted ads conspiracy theory.
In my last weeknotes I talked about how December in LLMs has been a lot. That was on December 20th, and it turned out there were at least three big new LLM stories still to come before the end of the year:
- OpenAI announced initial benchmarks for their o3 reasoning model, which I covered in a live blog for the last day of their mixed-quality 12 days of OpenAI series. o3 is genuinely impressive.
- Alibaba’s Qwen released their QvQ visual reasoning model, which I ran locally using mlx-vlm. It’s the o1/o3 style trick applied to image prompting and it runs on my laptop.
- DeepSeek—the other big open license Chinese AI lab—shocked everyone by releasing DeepSeek v3 on Christmas day, an open model that compares favorably to the very best closed models and was trained for just $5.6m, 11x less than Meta’s best Llama 3 model, Llama 3.1 405B.
For the second year running I published my review of LLM developments over the past year on December 31st. I’d estimate this took at least four hours of computer time to write and another two of miscellaneous note taking over the past few weeks, but that’s likely an under-estimate.
It went over really well. I’ve had a ton of great feedback about it, both from people who wanted to catch up and from people who have been following the space closely. I even got fireballed!
I’ve had a slower start to 2025 than I had intended. A challenge with writing online is that, like code, writing requires maintenance: any time I drop a popular article I feel obliged to track and participate in any resulting conversations.
Then just as the chatter about my 2024 review started to fade, the Apple Siri microphone settlement story broke and I couldn’t resist publishing “I still don’t think companies serve you ads based on spying through your microphone”.
Trying to talk people out of believing that conspiracy theory is my toxic trait. I know there’s no point even trying, but I can’t drag myself away.
I think my New Year’s resolution should probably be to spend less time arguing with people on the internet!
Anyway: January is here, and I’m determined to use it to make progress on both Datasette 1.0 and the paid launch of Datasette Cloud.
Blog entries
- I still don’t think companies serve you ads based on spying through your microphone
- Ending a year long posting streak
- Things we learned about LLMs in 2024
- Trying out QvQ—Qwen’s new visual reasoning model
- My approach to running a link blog
- Live blog: the 12th day of OpenAI—“Early evals for OpenAI o3”
TILs