Simon Willison’s Weblog

Items in Oct, 2023


My User Experience Porting Off setup.py (via) PyOxidizer maintainer Gregory Szorc provides a detailed account of his experience trying to figure out how to switch from setup.py to pyproject.toml for his zstandard Python package.

This kind of detailed usability feedback is incredibly valuable for project maintainers, especially when the user encountered this many different frustrations along the way. It’s like the written version of a detailed usability testing session. # 31st October 2023, 7:57 pm

Our search for the best OCR tool in 2023, and what we found. DocumentCloud’s Sanjin Ibrahimovic reviews the best options for OCR. Tesseract scores highly for easily machine readable text, newcomer docTR is great for ease of use but still not great at handwriting. Amazon Textract is great for everything except non-Latin languages, Google Cloud Vision is great at pretty much everything except for ease-of-use. Azure AI Document Intelligence sounds worth considering as well. # 31st October 2023, 7:21 pm

I’m Sorry I Bit You During My Job Interview. The way this 2011 McSweeney’s piece by Tom O’Donnell escalates is delightful. # 31st October 2023, 4:21 pm

Microsoft announces new Copilot Copyright Commitment for customers. Part of an interesting trend where some AI vendors are reassuring their paying customers by promising legal support in the face of future legal threats:

“As customers ask whether they can use Microsoft’s Copilot services and the output they generate without worrying about copyright claims, we are providing a straightforward answer: yes, you can, and if you are challenged on copyright grounds, we will assume responsibility for the potential legal risks involved.” # 31st October 2023, 3:35 pm

DALL-E 3, GPT4All, PMTiles, sqlite-migrate, datasette-edit-schema

I wrote a lot this week. I also did some fun research into new options for self-hosting vector maps and pushed out several new releases of plugins.

[... 1362 words]

Through the Ages: Apple CPU Architecture (via) I enjoyed this review of Apple’s various CPU migrations—Motorola 68k to PowerPC to Intel x86 to Apple Silicon—by Jacob Bartlett. # 30th October 2023, 10:56 pm

The thing nobody talks about with engineering management is this:

Every 3-4 months every person experiences some sort of personal crisis. A family member dies, they have a bad illness, they get into an argument with another person at work, etc. etc. Sadly, that is just life. Normally after a month or so things settle down and life goes on.

But when you are managing 6+ people it means there is *always* a crisis you are helping someone work through. You are always carrying a bit of emotional burden or worry around with you.

Chris Albon # 27th October 2023, 6:18 am

Making PostgreSQL tick: New features in pg_cron (via) pg_cron adds cron-style scheduling directly to PostgreSQL. It's a pretty mature extension at this point, and recently gained the ability to schedule repeating tasks at intervals as low as every 1s.

The examples in this post are really informative. I like this example, which cleans up the ever-growing cron.job_run_details table by using pg_cron itself to run the cleanup:

SELECT cron.schedule('delete-job-run-details', '0 12 * * *', $$DELETE FROM cron.job_run_details WHERE end_time < now() - interval '3 days'$$);

pg_cron can be used to schedule functions written in PL/pgSQL, which is a great example of the kind of DSL that I used to avoid but I'm now much happier to work with because I know GPT-4 can write basic examples for me and help me understand exactly what unfamiliar code is doing. # 27th October 2023, 2:57 am
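The '0 12 * * *' schedule in the pg_cron example above is a standard five-field cron expression: minute, hour, day of month, month, day of week. As a rough illustration of how such an expression selects run times, here is a minimal Python sketch. The cron_matches helper is hypothetical and deliberately simplified (it only understands * and single integers); it is not how pg_cron itself parses schedules.

```python
from datetime import datetime


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` matches a five-field cron expression.

    Simplified sketch: each field must be `*` or a single integer.
    Ranges, lists, steps and cron's special day-of-month/day-of-week
    rules are intentionally not handled.
    """
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    actual = [
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        (moment.weekday() + 1) % 7,  # cron counts Sunday as 0
    ]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


# '0 12 * * *' matches minute 0 of hour 12 on every day.
print(cron_matches("0 12 * * *", datetime(2023, 10, 27, 12, 0)))  # True
print(cron_matches("0 12 * * *", datetime(2023, 10, 27, 12, 1)))  # False
```

So the cleanup job in the example runs once a day at 12:00 and deletes run records that ended more than three days earlier.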

Now add a walrus: Prompt engineering in DALL‑E 3

Last year I wrote about my initial experiments with DALL-E 2, OpenAI’s image generation model. I’ve been having an absurd amount of fun playing with its sequel, DALL-E 3, recently. Here are some notes, including a peek under the hood and a look at the leaked system prompt.

[... 3505 words]

Oh-Auth—Abusing OAuth to take over millions of accounts (via) Describes an attack against vulnerable implementations of OAuth.

Let’s say your application uses OAuth against Facebook, and then takes the returned Facebook token and gives it access to the user account with the matching email address passed in the token from Facebook.

It’s critical that you also confirm the token was generated for your own application, not something else. Otherwise any secretly malicious app online that uses Facebook login could take one of its stored tokens and use it to hijack the account on your site that matches that user’s email address. # 26th October 2023, 3:51 pm
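The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the linked article's code: it assumes you have already asked the identity provider to inspect the token and parsed the response into a dict, and the field names app_id and is_valid, along with the OUR_APP_ID value, are assumptions made for the example.

```python
def token_belongs_to_app(token_info: dict, expected_app_id: str) -> bool:
    """Accept a token only if it is valid AND was issued for our own app.

    `token_info` is assumed to be the parsed result of a provider-side
    token inspection call; the key names here are illustrative.
    """
    return bool(token_info.get("is_valid")) and token_info.get("app_id") == expected_app_id


OUR_APP_ID = "1111"  # hypothetical application ID

# A token issued to our app, and one issued to some other app for the same user.
ours = {"app_id": "1111", "is_valid": True, "email": "user@example.com"}
theirs = {"app_id": "9999", "is_valid": True, "email": "user@example.com"}

print(token_belongs_to_app(ours, OUR_APP_ID))    # True
print(token_belongs_to_app(theirs, OUR_APP_ID))  # False
```

Only after this check passes should the email address associated with the token be used to look up a local account.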

Execute Jina embeddings with a CLI using llm-embed-jina

Berlin-based Jina AI just released a new family of embedding models, boasting that they are the “world’s first open-source 8K text embedding model” and that they rival OpenAI’s text-embedding-ada-002 in quality.

[... 1392 words]

If a LLM is like a database of millions of vector programs, then a prompt is like a search query in that database [...] this “program database” is continuous and interpolative — it’s not a discrete set of programs. This means that a slightly different prompt, like “Lyrically rephrase this text in the style of x” would still have pointed to a very similar location in program space, resulting in a program that would behave pretty closely but not quite identically. [...] Prompt engineering is the process of searching through program space to find the program that empirically seems to perform best on your target task.

François Chollet # 25th October 2023, 11:26 pm

Web Components Will Outlive Your JavaScript Framework (via) A really clear explanation of the benefit of Web Components built using dependency-free vanilla JavaScript, specifically for interactive components that you might want to embed in something like a blog post. Includes a very neat minimal example component. # 25th October 2023, 5:19 pm

chDB (via) This is a really interesting development: chDB offers “an embedded SQL OLAP Engine” as a Python package, which you can install using “pip install chdb”. What you’re actually getting is a wrapper around ClickHouse—it’s almost like ClickHouse has been repackaged into an embedded database similar to SQLite. # 24th October 2023, 11:04 pm

The real value in evolving as an engineer isn’t solely about amassing a heap of isolated skills but weaving them into an intricate web of abilities that’s greater than the sum of its parts.

Addy Osmani # 24th October 2023, 6:09 am

Embeddings: What they are and why they matter

Embeddings are a really neat trick that often come wrapped in a pile of intimidating jargon.

[... 5835 words]

Weeknotes: PyBay, AI Engineer Summit, Datasette metadata and JavaScript plugins

I’ve had a bit of a slow two weeks in terms of building things and writing code, thanks mainly to a couple of conference appearances. I did review and land a couple of major contributions to Datasette though.

[... 564 words]

Solving the Engineering Strategy Crisis (via) Will Larson’s 49m video discussing engineering strategy: what one is and how to build one. He defines an engineering strategy as having two key components: an honest diagnosis of the way things currently work, and a practical approach to making things better.

Towards the end of the talk he suggests that there are two paths to developing a new strategy. The first is to borrow top-down authority from a sponsor such as a CTO, and the second is to work without any borrowed authority, instead researching how things work at the moment and, by documenting that, writing a strategy document into existence! # 22nd October 2023, 9:18 pm

Patrick Newman’s Software Engineering Management Checklist (via) This tiny document may have the highest density of good engineering management advice I’ve ever encountered. # 22nd October 2023, 9:16 pm

New Default: Underlined Links for Improved Accessibility (GitHub Blog). “By default, links within text blocks on GitHub are now underlined. This ensures links are easily distinguishable from surrounding text.” # 19th October 2023, 4:19 pm

I’m banned for life from advertising on Meta. Because I teach Python. (via) If accurate, this describes a nightmare scenario of automated decision making.

Reuven recently found he had a permanent ban from advertising on Facebook. They won’t tell him exactly why, and have marked this as a final decision that can never be reviewed.

His best theory (impossible for him to confirm) is that it’s because he tried advertising a course on Python and Pandas a few years ago which was blocked because a dumb algorithm thought he was trading exotic animals!

The worst part? An appeal is no longer possible because relevant data is only retained for 180 days and so all of the related evidence has now been deleted.

Various comments on Hacker News from people familiar with these systems confirm that this story likely holds up. # 19th October 2023, 2:56 pm

The paradox of ChatGPT is that it is both a step forward beyond graphical user interfaces, because you can ask for anything, not just what’s been built as a feature with a button, but also a step back, because very quickly you have to memorise a bunch of obscure incantations, much like the command lines that GUIs replaced, and remember your ideas for what you wanted to do and how you did it last week

Benedict Evans # 17th October 2023, 11:09 pm

Making CRDTs 98% more efficient (via) Outstanding piece of explanatory writing by Jake Lazaroff showing how he reduced the transmitted state of his pixel art CRDT implementation from 643KB to 15KB using a progression of tricks, each of which is meticulously explained and accompanied by an interactive demo. # 17th October 2023, 5:15 pm

Open questions for AI engineering

Last week I gave the closing keynote at the AI Engineer Summit in San Francisco. The organizers asked me to summarize the conference, summarize the last year of activity in the space, and give the audience something to think about by posing some open questions for them to take home.

[... 6928 words]

Multimodality and Large Multimodal Models (LMMs) (via) Useful, extensive review of the current state of the art of multimodal models by Chip Huyen. Chip calls them LMMs for Large Multimodal Models, a term that seems to be catching on. # 14th October 2023, 7:51 pm

Multi-modal prompt injection image attacks against GPT-4V

GPT-4V is the new mode of GPT-4 that allows you to upload images as part of your conversations. It’s absolutely brilliant. It also provides a whole new set of vectors for prompt injection attacks.

[... 889 words]

Wikimedia Commons: Photographs by Gage Skidmore (via) Gage Skidmore is a Wikipedia legend: this category holds 93,458 photographs taken by Gage and released under a Creative Commons license, including a vast number of celebrities taken at events like San Diego Comic-Con. CC licensed photos of celebrities are generally pretty hard to come by so if you see a photo of any celebrity on Wikipedia there’s a good chance it’s credited to Gage. # 10th October 2023, 4:17 am

Bottleneck T5 Text Autoencoder (via) Colab notebook by Linus Lee demonstrating his Contra Bottleneck T5 embedding model, which can take up to 512 tokens of text, convert that into an embedding vector of 1024 floating point numbers... and then reconstruct the original text (or a close imitation) from the embedding again.

This allows for some fascinating tricks, where you can do things like generate embeddings for two completely different sentences and then reconstruct a new sentence that combines the weights from both. # 10th October 2023, 2:12 am
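One plausible way to combine two embeddings like this is plain linear interpolation between the vectors. The sketch below shows only that mixing step, on tiny made-up vectors. Producing real embeddings and decoding the result back into text needs the model itself, which is not shown here, and the notebook may combine vectors differently.

```python
def interpolate(a: list[float], b: list[float], t: float = 0.5) -> list[float]:
    """Blend two equal-length embedding vectors.

    t=0 returns `a`, t=1 returns `b`, and t=0.5 is the midpoint.
    """
    if len(a) != len(b):
        raise ValueError("embedding vectors must have the same length")
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


# Toy 2-dimensional stand-ins for real 1024-dimensional embeddings.
sentence_one = [0.0, 2.0]
sentence_two = [2.0, 4.0]

print(interpolate(sentence_one, sentence_two))  # [1.0, 3.0]
```

Feeding the blended vector to the decoder half of an autoencoder is what would turn it back into a new sentence.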

Claude was trained on data up until December 2022, but may know some events into early 2023.

How up-to-date is Claude's training data? # 9th October 2023, 1:25 am

Decomposing Language Models Into Understandable Components. Anthropic appear to have made a major breakthrough with respect to the interpretability of Large Language Models:

“[...] we outline evidence that there are better units of analysis than individual neurons, and we have built machinery that lets us find these units in small transformer models. These units, called features, correspond to patterns (linear combinations) of neuron activations. This provides a path to breaking down complex neural networks into parts we can understand” # 8th October 2023, 3:43 pm