Example dashboard

Various statistics from my blog.

Owned by simonw, visibility: Public

Entries

3084

SQL query
select 'Entries' as label, count(*) as big_number from blog_entry

Blogmarks

7272

SQL query
select 'Blogmarks' as label, count(*) as big_number from blog_blogmark

Quotations

953

SQL query
select 'Quotations' as label, count(*) as big_number from blog_quotation
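
The three panels above all follow the same convention: a query that returns a label column and a big_number column renders as a big number widget. As a rough sketch (not one of the saved widgets), a fourth panel summing the three tables would look like this:

-- hypothetical extra panel, reusing only the tables already queried above
select 'Total items' as label,
       (select count(*) from blog_entry)
     + (select count(*) from blog_blogmark)
     + (select count(*) from blog_quotation) as big_number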

Chart of number of entries per month over time

SQL query
select '<h2>Chart of number of entries per month over time</h2>' as html
SQL query
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
count(*) as bar_quantity from blog_entry group by bar_label order by bar_label
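
The chart widget works the same way, driven by bar_label and bar_quantity columns. A sketch (not one of the saved widgets) of the equivalent chart for blogmarks, reusing the created column already used elsewhere on this dashboard:

-- one bar per month, in chronological order
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
       count(*) as bar_quantity
from blog_blogmark
group by bar_label
order by bar_label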

Ten most recent blogmarks (of 7272 total)

SQL query
select '## Ten most recent blogmarks (of ' || count(*) || ' total)' as markdown from blog_blogmark
SQL query
select link_title, link_url, commentary, created from blog_blogmark order by created desc limit 10

10 rows

link_title link_url commentary created
Image resize and quality comparison https://tools.simonwillison.net/image-resize-quality Another tiny tool I built with Claude 3.5 Sonnet and Artifacts. This one lets you select an image (or drag-drop one onto an area) and then displays that same image as a JPEG at 1, 0.9, 0.7, 0.5, 0.3 quality settings, then again at half the width. Each image shows its size in KB and can be downloaded directly from the page. <img src="https://static.simonwillison.net/static/2024/image-resize-tool.jpg" alt="Screenshot of the tool, showing a resized photo of a blue heron"> I'm trying to use more images on my blog ([example 1](https://simonwillison.net/2024/Jul/25/button-stealer/), [example 2](https://simonwillison.net/2024/Jul/26/did-you-know-about-instruments/)) and I like to reduce their file size and quality while keeping them legible. The prompt sequence I used for this was: > Build an artifact (no React) that I can drop an image onto and it presents that image resized to different JPEG quality levels, each with a download link Claude produced [this initial artifact](https://claude.site/artifacts/a469a051-6941-4e2f-ba81-f4ef16a2cd33). I followed up with: > change it so that for any image it provides it in the following: > > * original width, full quality > * original width, 0.9 quality > * original width, 0.7 quality > * original width, 0.5 quality > * original width, 0.3 quality > * half width - same array of qualities > > For each image clicking it should toggle its display to full width and then back to max-width of 80% > > Images should show their size in KB Claude produced [this v2](https://claude.site/artifacts/45ecf75e-d8e2-4d2a-a3b9-d8c07c7bd757). I tweaked it a tiny bit (modifying how full-width images are displayed) - the final source code [is available here](https://github.com/simonw/tools/blob/main/image-resize-quality.html). I'm hosting it on my own site which means the Download links work correctly - when hosted on `claude.site` Claude's CSP headers prevent those from functioning. 2024-07-26 13:20:16+00:00
Did you know about Instruments? https://registerspill.thorstenball.com/p/did-you-know-about-instruments Thorsten Ball shows how the macOS Instruments app (installed as part of Xcode) can be used to run a CPU profiler against _any_ application - not just code written in Swift/Objective C. I tried this against a Python process running [LLM](https://llm.datasette.io/) executing a Llama 3.1 prompt with my new [llm-gguf](https://github.com/simonw/llm-gguf) plugin and captured this: ![Screenshot of a deep nested stack trace showing _PyFunction_Vectorcall from python3.10 calling PyCFuncPtr_call _ctypes.cpython-310-darwin.so which then calls ggml_ methods in libggml.dylib](https://static.simonwillison.net/static/2024/instruments-ggml.jpg) 2024-07-26 13:06:38+00:00
Introducing sqlite-lembed: A SQLite extension for generating text embeddings locally https://alexgarcia.xyz/blog/2024/sqlite-lembed-init/index.html Alex Garcia's latest SQLite extension is a C wrapper around [llama.cpp](https://github.com/ggerganov/llama.cpp) that exposes just its embedding support, allowing you to register a GGUF file containing an embedding model: INSERT INTO temp.lembed_models(name, model) select 'all-MiniLM-L6-v2', lembed_model_from_file('all-MiniLM-L6-v2.e4ce9877.q8_0.gguf'); And then use it to calculate embeddings as part of a SQL query: select lembed( 'all-MiniLM-L6-v2', 'The United States Postal Service is an independent agency...' ); -- X'A402...09C3' (1536 bytes) `all-MiniLM-L6-v2.e4ce9877.q8_0.gguf` here is a 24MB file, so this should run quite happily even on machines without much available RAM. What if you don't want to run the models locally at all? Alex has another new extension for that, described in **[Introducing sqlite-rembed: A SQLite extension for generating text embeddings from remote APIs](https://alexgarcia.xyz/blog/2024/sqlite-rembed-init/index.html)**. The `rembed` name is short for remote embeddings, and this extension uses Rust to call multiple remotely-hosted embeddings APIs, registered like this: INSERT INTO temp.rembed_clients(name, options) VALUES ('text-embedding-3-small', 'openai'); select rembed( 'text-embedding-3-small', 'The United States Postal Service is an independent agency...' ); -- X'A452...01FC', Blob<6144 bytes> Here's [the Rust code](https://github.com/asg017/sqlite-rembed/blob/v0.0.1-alpha.9/src/clients.rs) that implements wrapper functions for HTTP JSON APIs from OpenAI, Nomic, Cohere, Jina, Mixedbread and localhost servers provided by Ollama and Llamafile. Both of these extensions are designed to complement Alex's [sqlite-vec](https://github.com/asg017/sqlite-vec) extension, which is nearing a first stable release. 2024-07-25 20:30:01+00:00
AI crawlers need to be more respectful https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/ Eric Holscher: > At Read the Docs, we host documentation for many projects and are generally bot friendly, but the behavior of AI crawlers is currently causing us problems. We have noticed AI crawlers aggressively pulling content, seemingly without basic checks against abuse. One crawler downloaded 73 TB of zipped HTML files just in one month, racking up $5,000 in bandwidth charges! 2024-07-25 20:02:25+00:00
Button Stealer https://anatolyzenkov.com/stolen-buttons/button-stealer Really fun Chrome extension by Anatoly Zenkov: it scans every web page you visit for things that look like buttons and stashes a copy of them, then provides a page where you can see all of the buttons you have collected. Here's [Anatoly's collection](https://anatolyzenkov.com/stolen-buttons), and here are a few that I've picked up trying it out myself: ![Screenshot showing some buttons I have collected, each with their visual appearance maintained](https://static.simonwillison.net/static/2024/stolen-buttons.jpg) The extension source code is [on GitHub](https://github.com/anatolyzenkov/button-stealer). It identifies potential buttons by looping through every `<a>` and `<button>` element and [applying some heuristics](https://github.com/anatolyzenkov/button-stealer/blob/cfe43b6247e1b9f7d4414fd2a9b122c2d1a40840/scripts/button-stealer.js#L264-L298) like checking the width/height ratio, then [clones a subset of the CSS](https://github.com/anatolyzenkov/button-stealer/blob/cfe43b6247e1b9f7d4414fd2a9b122c2d1a40840/scripts/button-stealer.js#L93-L140) from `window.getComputedStyle()` and stores that in the `style=` attribute. 2024-07-25 19:40:08+00:00
wat https://github.com/igrek51/wat This is a really neat Python debugging utility. Install with `pip install wat-inspector` and then inspect any Python object like this: from wat import wat wat / myvariable The `wat / x` syntax is a shortcut for `wat(x)` that's quicker to type. The tool dumps out all sorts of useful introspection about the variable, value, class or package that you pass to it. There are several variants: `wat.all / x` gives you all of them, or you can chain several together like `wat.dunder.code / x`. The documentation also provides a slightly intimidating copy-paste version of the tool which uses `exec()`, `zlib` and `base64` to help you paste the full implementation directly into any Python interactive session without needing to install it first. 2024-07-25 18:58:27+00:00
Google is the only search engine that works on Reddit now thanks to AI deal https://www.404media.co/google-is-the-only-search-engine-that-works-on-reddit-now-thanks-to-ai-deal/ This is depressing. As of around June 25th [reddit.com/robots.txt](https://www.reddit.com/robots.txt) contains this: User-agent: * Disallow: / Along with a link to Reddit's [Public Content Policy](https://support.reddithelp.com/hc/en-us/articles/26410290525844-Public-Content-Policy). Is this a direct result of Google's deal to license Reddit content for AI training, rumored [at $60 million](https://www.reuters.com/technology/reddit-ai-content-licensing-deal-with-google-sources-say-2024-02-22/)? That's not been confirmed but it looks likely, especially since accessing that `robots.txt` using the [Google Rich Results testing tool](https://search.google.com/test/rich-results) (hence proxied via their IP) appears to return a different file, via [this comment](https://news.ycombinator.com/item?id=41057033#41058375), [my copy here](https://gist.github.com/simonw/be0e8e595178207b1b3dce3b81eacfb3). 2024-07-24 18:29:55+00:00
Mistral Large 2 https://mistral.ai/news/mistral-large-2407/ The second release of a GPT-4 class open weights model in two days, after yesterday's [Llama 3.1 405B](https://simonwillison.net/2024/Jul/23/introducing-llama-31/). The weights for this one are under Mistral's [Research License](https://mistral.ai/licenses/MRL-0.1.md), which "allows usage and modification for research and non-commercial usages" - so not as open as Llama 3.1. You can use it commercially via the Mistral paid API. Mistral Large 2 is 123 billion parameters, "designed for single-node inference" (on a very expensive single-node!) and has a 128,000 token context window, the same size as Llama 3.1. Notably, according to Mistral's own benchmarks it out-performs the much larger Llama 3.1 405B on their code and math benchmarks. They trained on a lot of code: > Following our experience with [Codestral 22B](https://mistral.ai/news/codestral/) and [Codestral Mamba](https://mistral.ai/news/codestral-mamba/), we trained Mistral Large 2 on a very large proportion of code. Mistral Large 2 vastly outperforms the previous Mistral Large, and performs on par with leading models such as GPT-4o, Claude 3 Opus, and Llama 3 405B. They also invested effort in tool usage, multilingual support (across English, French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, and Hindi) and reducing hallucinations: > One of the key focus areas during training was to minimize the model’s tendency to “hallucinate” or generate plausible-sounding but factually incorrect or irrelevant information. This was achieved by fine-tuning the model to be more cautious and discerning in its responses, ensuring that it provides reliable and accurate outputs. > > Additionally, the new Mistral Large 2 is trained to acknowledge when it cannot find solutions or does not have sufficient information to provide a confident answer. I went to update my [llm-mistral](https://github.com/simonw/llm-mistral) plugin for LLM to support the new model and found that I didn't need to - that plugin already uses `llm -m mistral-large` to access the `mistral-large-latest` endpoint, and Mistral have updated that to point to the latest version of their Large model. Ollama now have [mistral-large](https://ollama.com/library/mistral-large) quantized to 4 bit as a 69GB download. 2024-07-24 15:56:23+00:00
llm-gguf https://github.com/simonw/llm-gguf I just released a new alpha plugin for [LLM](https://llm.datasette.io/) which adds support for running models from [Meta's new Llama 3.1 family](https://simonwillison.net/2024/Jul/23/introducing-llama-31/) that have been packaged as GGUF files - it should work for other GGUF chat models too. If you've [already installed LLM](https://llm.datasette.io/en/stable/setup.html) the following set of commands should get you set up with Llama 3.1 8B: llm install llm-gguf llm gguf download-model \ https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \ --alias llama-3.1-8b-instruct --alias l31i This will download a 4.92GB GGUF from [lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF](https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main) on Hugging Face and save it (at least on macOS) to your `~/Library/Application Support/io.datasette.llm/gguf/models` folder. Once installed like that, you can run prompts through the model like so: llm -m l31i "five great names for a pet lemur" Or use the `llm chat` command to keep the model resident in memory and run an interactive chat session with it: llm chat -m l31i I decided to ship a new alpha plugin rather than update my existing [llm-llama-cpp](https://github.com/simonw/llm-llama-cpp) plugin because that older plugin has some design decisions baked in from the Llama 2 release which no longer make sense, and having a fresh plugin gave me a fresh slate to adopt the latest features from the excellent underlying [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library by Andrei Betlen. 2024-07-23 22:18:40+00:00
Introducing Llama 3.1: Our most capable models to date https://ai.meta.com/blog/meta-llama-3-1/ We've been waiting for the largest release of the Llama 3 model for a few months, and now we're getting a whole new model family instead. Meta are calling Llama 3.1 405B "the first frontier-level open source AI model" and it really is benchmarking in that GPT-4+ class, competitive with both GPT-4o and Claude 3.5 Sonnet. I'm equally excited by the new 8B and 70B 3.1 models - both of which now support a 128,000 token context and benchmark significantly higher than their Llama 3 equivalents. Same-sized models getting more powerful and capable is a very reassuring trend. I expect the 8B model (or variants of it) to run comfortably on an array of consumer hardware, and I've run a 70B model on a 64GB M2 in the past. The 405B model can at least be run on a single server-class node: > To support large-scale production inference for a model at the scale of the 405B, we quantized our models from 16-bit (BF16) to 8-bit (FP8) numerics, effectively lowering the compute requirements needed and allowing the model to run within a single server node. Meta also made a significant [change to the license](https://twitter.com/aiatmeta/status/1815766335219249513): > **We’ve also updated our license** to allow developers to use the outputs from Llama models — including 405B — to improve other models for the first time. > > We’re excited about how this will **enable new advancements in the field through synthetic data generation and model distillation workflows**, capabilities that have never been achieved at this scale in open source. I'm really pleased to see this. Using models to help improve other models has been a crucial technique in LLM research for over a year now, especially for fine-tuned community models released on Hugging Face. Researchers have mostly been ignoring this restriction, so it's reassuring to see the uncertainty around that finally cleared up. Lots more details about the new models in the paper [The Llama 3 Herd of Models](https://ai.meta.com/research/publications/the-llama-3-herd-of-models/) including this somewhat opaque note about the 15 trillion token training data: > Our final data mix contains roughly 50% of tokens corresponding to general knowledge, 25% of mathematical and reasoning tokens, 17% code tokens, and 8% multilingual tokens. **Update**: I got the Llama 3.1 8B Instruct model working with my [LLM](https://llm.datasette.io/) tool via a new plugin, [llm-gguf](https://simonwillison.net/2024/Jul/23/llm-gguf/). 2024-07-23 15:40:47+00:00
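
The panel above pairs two queries: the first builds a markdown heading with || string concatenation, the second returns the rows to display as a table. A sketch of the same pattern for quotations, assuming blog_quotation also has a created column (only its row count is queried elsewhere on this dashboard):

-- heading query, returning a markdown column
select '## Ten most recent quotations (of ' || count(*) || ' total)' as markdown from blog_quotation

-- table query; column names for blog_quotation are not shown on this page, so select * is used
select * from blog_quotation order by created desc limit 10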