Simon Willison’s Weblog

Saturday, 23rd November 2024

Quantization matters (via) What impact does quantization have on the performance of an LLM? I've been wondering about this for quite a while, and now here are numbers from Paul Gauthier.

He ran differently quantized versions of Qwen 2.5 32B Instruct through his Aider code editing benchmark and saw a range of scores.

The original released weights (BF16) scored highest at 71.4%, with Ollama's qwen2.5-coder:32b-instruct-fp16 (a 66GB download) achieving the same score.

The quantized Ollama qwen2.5-coder:32b-instruct-q4_K_M (a 20GB download) saw a massive drop in quality, scoring just 53.4% on the same benchmark.
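To put those two results side by side, here is a minimal sketch of the arithmetic in JavaScript. It uses only the scores and download sizes quoted above; the `compare` function and its field names are my own invention for illustration, not part of Aider or Ollama. By this calculation the q4_K_M build gives up about 18 percentage points (roughly a quarter of the fp16 score) for a download about 30% of the size.

```javascript
// Scores and download sizes as quoted above from Paul Gauthier's benchmark run.
const results = [
  { name: "qwen2.5-coder:32b-instruct-fp16", sizeGB: 66, score: 71.4 },
  { name: "qwen2.5-coder:32b-instruct-q4_K_M", sizeGB: 20, score: 53.4 },
];

// Hypothetical helper: summarize what a quantized build costs and saves
// relative to a baseline. Values are rounded to one decimal place.
function compare(baseline, quantized) {
  const pointDrop = baseline.score - quantized.score;
  return {
    // Absolute drop in benchmark score, in percentage points.
    pointDrop: Number(pointDrop.toFixed(1)),
    // Drop as a share of the baseline score.
    relativeDropPct: Number(((pointDrop / baseline.score) * 100).toFixed(1)),
    // Quantized download size as a share of the baseline download size.
    sizeRatioPct: Number(((quantized.sizeGB / baseline.sizeGB) * 100).toFixed(1)),
  };
}

const summary = compare(results[0], results[1]);
console.log(summary);
```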

# 6:39 pm / ai, generative-ai, edge-llms, llms, aider, ollama

If you try and tell people 5 interesting things about your product / company / cause, they’ll remember zero. If instead, you tell them just one, they’ll usually ask questions that lead them to the other things, and then they’ll remember all of them because it mattered to them at the moment they asked.

James Dillard

# 6:47 pm / entrepreneurship, startups

Importing a frontend Javascript library without a build system. I sometimes think the hardest problem in computer science right now is taking an NPM library and figuring out how to download it and use it from a <script> tag without needing to involve some sort of convoluted build system.

Julia Evans shares my preference for build-free JavaScript, and has published notes on how to turn an arbitrary NPM package into something that can be loaded in a browser.

It's so complicated! This is the best exploration I've seen yet of the topic but wow, this really needs to be easier.

My download-esm tool gets a mention, but I have to admit I'm not 100% confident in that as a robust solution. I don't know nearly enough about the full scope of the problem here to confidently recommend my own tool!

Right now my ideal solution would turn almost anything from NPM into an ES module that I can self-host and then load using import ... from in a <script type="module"> block, maybe with an importmap as long as I don't have to think too hard about what to put in it.
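As a rough sketch of what that import map might look like, here is a small JavaScript helper that generates one for a list of self-hosted modules. Everything specific in it is an assumption for illustration: the `buildImportMap` and `renderImportMapTag` names, the `/static/esm/` path and the `example-lib` package are all hypothetical, and it assumes each package has already been converted into a single ES module file by some other tool.

```javascript
// Hypothetical helper: map bare package names to self-hosted ES module files.
// Assumes each package already exists as <baseUrl><name>.js on your own server.
function buildImportMap(packages, baseUrl = "/static/esm/") {
  const imports = {};
  for (const name of packages) {
    imports[name] = `${baseUrl}${name}.js`;
  }
  // Import maps are JSON objects with a top-level "imports" key.
  return { imports };
}

// Render the map as a <script type="importmap"> tag for pasting into a page.
function renderImportMapTag(importMap) {
  const json = JSON.stringify(importMap, null, 2);
  return `<script type="importmap">\n${json}\n</script>`;
}

// "example-lib" is a made-up package name.
const importMap = buildImportMap(["example-lib"]);
const tag = renderImportMapTag(importMap);
console.log(tag);

// With that tag in the page, a module script could use the bare specifier:
//   <script type="module">
//     import { something } from "example-lib";
//   </script>
```

The import map has to appear in the page before any module script that relies on it. This only covers the easy part of the problem: it does nothing about packages that ship only CommonJS or that pull in their own dependencies.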

I'm intrigued by esm.sh (mentioned by Julia as a new solution worth exploring). The length of the documentation on that page further reinforces quite how much there is that I need to understand here.

# 7:18 pm / javascript, npm, julia-evans