Simon Willison’s Weblog

June 2022

43 posts: 8 entries, 7 links, 2 quotes, 26 beats

June 23, 2022

Release datasette-scale-to-zero 0.1.2 — Quit Datasette if it has not received traffic for a specified time period

How Imagen Actually Works. Imagen is Google’s new text-to-image model, similar to (but possibly even more effective than) DALL-E. This article is the clearest explanation I’ve seen of how Imagen works: it uses Google’s existing T5 text encoder to convert the input sentence into an encoding that captures the semantic meaning of the sentence (including things like items being described as being on top of other items), then uses a trained diffusion model to generate a 64x64 image. That image is passed through two super-res models to increase the resolution to the final 1024x1024 output.
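
The staged pipeline described above can be sketched with placeholder functions. This is only an illustration of how the stages chain together, not Imagen's actual code: every function name here is hypothetical, the "models" are stubs that track image dimensions only, and the 4x factor for each super-resolution step is an assumption chosen so that two steps take 64x64 to 1024x1024.

```python
from dataclasses import dataclass


@dataclass
class Image:
    width: int
    height: int


def encode_text(prompt: str) -> list[float]:
    # Stand-in for the T5 text encoder: returns a fake "encoding".
    return [float(len(word)) for word in prompt.split()]


def base_diffusion(encoding: list[float]) -> Image:
    # Stand-in for the diffusion model that produces the small base image.
    return Image(64, 64)


def super_resolve(image: Image, factor: int) -> Image:
    # Stand-in for one super-resolution model.
    return Image(image.width * factor, image.height * factor)


def generate(prompt: str) -> Image:
    encoding = encode_text(prompt)
    image = base_diffusion(encoding)
    # Two super-resolution passes; the 4x factor per pass is an assumption.
    for factor in (4, 4):
        image = super_resolve(image, factor)
    return image


result = generate("a teapot on top of a stack of books")
print(result)
```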

# 6:05 pm / google, machine-learning, ai, dalle, generative-ai

First impressions of DALL-E, generating images from text

I made it off the DALL-E waiting list a few days ago and I’ve been having an enormous amount of fun experimenting with it. Here are some notes on what I’ve learned so far (and a bunch of example images too).

[... 2,102 words]

June 28, 2022

TIL Ignoring errors in a section of a Bash script — For [simonw/museums#32](https://github.com/simonw/museums/issues/32) I wanted to have certain lines in my Bash script ignore any errors: lines that used `sqlite-utils` to add columns and configure FTS, but that might fail with an error if the column already existed or FTS had already been configured.
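
The TIL itself is about Bash. As a rough Python analogue of the same problem (a setup step that may fail because it has already been run), here is a minimal sketch using the standard library `sqlite3` module rather than `sqlite-utils`. The `museums` table, the `description` column and the `add_column_if_missing` helper are all hypothetical names for illustration. It swallows only the "duplicate column name" error and re-raises anything else.

```python
import sqlite3


def add_column_if_missing(conn, table, column, col_type="TEXT"):
    """Try to add a column; ignore only the 'column already exists' error."""
    try:
        conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}" {col_type}')
        return True
    except sqlite3.OperationalError as ex:
        if "duplicate column name" in str(ex):
            return False  # Expected on a re-run, safe to ignore
        raise  # Anything else is a real problem


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE museums (id INTEGER PRIMARY KEY, name TEXT)")

first = add_column_if_missing(conn, "museums", "description")
second = add_column_if_missing(conn, "museums", "description")
print(first, second)
```

Catching one specific error is narrower than ignoring every failure in a block, which may or may not be what you want in a build script.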

The Magic Interview Question. Jeff Gothelf explains why “Tell me about the last time you [did something]” is the most valuable question you can ask when interviewing a user or potential user.

# 2:26 pm / usability

The general idea of an “Islands” architecture is deceptively simple: render HTML pages on the server, and inject placeholders or slots around highly dynamic regions. These placeholders/slots contain the server-rendered HTML output from their corresponding widget. They denote regions that can then be "hydrated" on the client into small self-contained widgets, reusing their server-rendered initial HTML.

Jason Miller

# 3:01 pm / javascript

TIL Running OCR against a PDF file with AWS Textract — [Textract](https://aws.amazon.com/textract/) is the AWS OCR API. It's very good - I've fed it hand-written notes from the 1890s and it read them better than I could.

June 29, 2022

Release s3-ocr 0.1a0 — Tools for running OCR against files stored in S3
Release s3-ocr 0.2a0 — Tools for running OCR against files stored in S3

June 30, 2022

Release s3-ocr 0.3 — Tools for running OCR against files stored in S3
Release s3-credentials 0.12 — A tool for creating credentials for accessing S3 buckets
Release s3-ocr 0.4 — Tools for running OCR against files stored in S3

s3-ocr: Extract text from PDF files stored in an S3 bucket

I’ve released s3-ocr, a new tool that runs Amazon’s Textract OCR text extraction against PDF files in an S3 bucket, then writes the resulting text out to a SQLite database with full-text search configured so you can run searches against the extracted data.
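
To illustrate the last step, here is a minimal sketch of storing extracted page text in SQLite and searching it with full-text search. It uses the standard library `sqlite3` module and assumes the SQLite build includes the FTS5 extension. The `pages_fts` table, its columns and the sample rows are hypothetical, and this is not necessarily the schema that s3-ocr itself creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One FTS5 table: path and page are stored but not indexed for search.
conn.execute(
    "CREATE VIRTUAL TABLE pages_fts USING fts5(path UNINDEXED, page UNINDEXED, text)"
)

# Made-up rows standing in for OCR output.
rows = [
    ("letters/1893.pdf", 1, "Dear sister, the harvest was good this year"),
    ("minutes/1921.pdf", 3, "The committee approved the new budget"),
]
conn.executemany(
    "INSERT INTO pages_fts (path, page, text) VALUES (?, ?, ?)", rows
)


def search(conn, query):
    """Return the paths of matching pages, best match first."""
    sql = "SELECT path FROM pages_fts WHERE pages_fts MATCH ? ORDER BY rank"
    return [row[0] for row in conn.execute(sql, (query,))]


matches = search(conn, "harvest")
print(matches)
```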

[... 1,493 words]
