Archive for June 2022

June 2022

51 posts: 8 entries, 7 links, 2 quotes, 34 beats

June 20, 2022

Joining CSV files in your browser using Datasette Lite

I added a new feature to Datasette Lite, my version of Datasette that runs entirely in your browser using WebAssembly: you can now use it to load one or more CSV files by URL and then run SQL queries against them, including joins across data from multiple files.

[... 546 words]

9:20 pm / csv, projects, sql, datasette, webassembly, datasette-lite, cors
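As a rough illustration of the idea (not Datasette Lite's own code, which runs in the browser), here is a minimal Python sketch that loads two CSV sources into an in-memory SQLite database and joins them. The `load_csv` helper, the table names and the sample data are all hypothetical stand-ins for CSV files fetched by URL.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, text):
    """Create `table` from CSV text, using the first row as column names."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # Columns are declared without types, so values stay as the strings read from the CSV.
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


# Hypothetical sample data standing in for two CSV files loaded by URL.
PEOPLE_CSV = "id,name\n1,Ada\n2,Grace\n"
PETS_CSV = "owner_id,pet\n1,cat\n2,dog\n2,parrot\n"


def join_people_and_pets():
    conn = sqlite3.connect(":memory:")
    load_csv(conn, "people", PEOPLE_CSV)
    load_csv(conn, "pets", PETS_CSV)
    query = """
        SELECT people.name, pets.pet
        FROM people JOIN pets ON pets.owner_id = people.id
        ORDER BY people.name, pets.pet
    """
    return conn.execute(query).fetchall()
```

Each CSV becomes its own table, so a join across files is an ordinary SQL join across tables.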

TIL One-liner for running queries against CSV files with SQLite — I figured out how to run a SQL query directly against a CSV file using the `sqlite3` command-line utility.

20th Jun 2022, 11 pm

June 21, 2022

Sighting 10:06 AM — Cottontail Rabbits, in Pillar Point Harbor, CA, US

21st Jun 2022

Release datasette-scale-to-zero 0.1 — Quit Datasette if it has not received traffic for a specified time period

21st Jun 2022, 10:51 pm · datasette
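The "quit after a period without traffic" idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the plugin's implementation: the `IdleTracker` class, its method names and the injectable `clock` parameter are all hypothetical.

```python
import time


class IdleTracker:
    """Tracks the time of the last request and reports when the idle limit is reached."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_request = clock()

    def record_request(self):
        # Call this whenever the server handles a request.
        self.last_request = self.clock()

    def should_quit(self):
        # True once the server has been idle for at least the configured period.
        return self.clock() - self.last_request >= self.max_idle_seconds
```

A server would call `record_request()` on each incoming request and poll `should_quit()` from a background task, exiting the process when it returns true.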

June 22, 2022

Release datasette-scale-to-zero 0.1.1 — Quit Datasette if it has not received traffic for a specified time period

22nd Jun 2022, 12:07 am · datasette

June 23, 2022

Release datasette-scale-to-zero 0.1.2 — Quit Datasette if it has not received traffic for a specified time period

23rd Jun 2022, 12:06 am · datasette

How Imagen Actually Works. Imagen is Google’s new text-to-image model, similar to (but possibly even more effective than) DALL-E. This article is the clearest explanation I’ve seen of how Imagen works: it uses Google’s existing T5 text encoder to convert the input sentence into an encoding that captures the semantic meaning of the sentence (including things like items being described as being on top of other items), then uses a trained diffusion model to generate a 64x64 image. That image is passed through two super-res models to increase the resolution to the final 1024x1024 output.

6:05 pm / google, machine-learning, ai, dalle, generative-ai

Sighting 1:05 PM — Red-tailed Hawk, in Pillar Point Harbor, CA, US

23rd Jun 2022

First impressions of DALL-E, generating images from text

I made it off the DALL-E waiting list a few days ago and I’ve been having an enormous amount of fun experimenting with it. Here are some notes on what I’ve learned so far (and a bunch of example images too).

[... 2,102 words]

11:05 pm / machine-learning, ai, openai, dalle, prompt-engineering, generative-ai, text-to-image

June 27, 2022

Sighting 8:17 PM — Eastern Gray Squirrel, in Capitol Park, CA, US

27th Jun 2022

June 28, 2022

TIL Ignoring errors in a section of a Bash script — For [simonw/museums#32](https://github.com/simonw/museums/issues/32) I wanted to have certain lines in my Bash script ignore any errors: lines that used `sqlite-utils` to add columns and configure FTS, but that might fail with an error if the column already existed or FTS had already been configured.

28th Jun 2022, 12:24 am
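The TIL itself is about Bash. For comparison, here is a Python analogue of the same pattern, not the technique from the TIL: suppress the error from one step that may already have been applied, while letting every other error surface. The `museums` table and `city` column are hypothetical names chosen for illustration.

```python
import contextlib
import sqlite3


def setup_schema(conn):
    """Idempotent schema setup: safe to run repeatedly against the same database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS museums (id INTEGER PRIMARY KEY, name TEXT)"
    )
    # On a re-run this raises OperationalError (duplicate column name),
    # so only this one statement has that error suppressed.
    with contextlib.suppress(sqlite3.OperationalError):
        conn.execute("ALTER TABLE museums ADD COLUMN city TEXT")
    # Return the column names so callers can confirm the resulting schema.
    return [row[1] for row in conn.execute("PRAGMA table_info(museums)")]
```

Keeping the suppressed region as small as possible means unrelated failures still stop the script.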

The Magic Interview Question. Jeff Gothelf explains why “Tell me about the last time you [did something]” is the most valuable question you can ask when interviewing a user or potential user.

2:26 pm / usability

The general idea of an “Islands” architecture is deceptively simple: render HTML pages on the server, and inject placeholders or slots around highly dynamic regions. These placeholders/slots contain the server-rendered HTML output from their corresponding widget. They denote regions that can then be “hydrated” on the client into small self-contained widgets, reusing their server-rendered initial HTML.

— Jason Miller

3:01 pm / javascript

TIL Running OCR against a PDF file with AWS Textract — [Textract](https://aws.amazon.com/textract/) is the AWS OCR API. It's very good - I've fed it hand-written notes from the 1890s and it read them better than I could.

28th Jun 2022, 7:32 pm

June 29, 2022

Release s3-ocr 0.1a0 — Tools for running OCR against files stored in S3

29th Jun 2022, 2:53 am

Release s3-ocr 0.2a0 — Tools for running OCR against files stored in S3

29th Jun 2022, 7:35 pm

Sighting 1:48 PM — Brown Pelican, in Monterey Bay National Marine Sanctuary, CA, US

29th Jun 2022

June 30, 2022

Release s3-ocr 0.3 — Tools for running OCR against files stored in S3

30th Jun 2022, 12:44 am

Release s3-credentials 0.12 — A tool for creating credentials for accessing S3 buckets

30th Jun 2022, 8:02 pm

Release s3-ocr 0.4 — Tools for running OCR against files stored in S3

30th Jun 2022, 9:03 pm

s3-ocr: Extract text from PDF files stored in an S3 bucket

I’ve released s3-ocr, a new tool that runs Amazon’s Textract OCR text extraction against PDF files in an S3 bucket, then writes the resulting text out to a SQLite database with full-text search configured so you can run searches against the extracted data.

[... 1,493 words]

9:40 pm / aws, ocr, pdf, projects, s3, weeknotes, s3-credentials
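The last step described above, writing extracted text to SQLite with full-text search, can be sketched as follows. This is a simplified illustration rather than s3-ocr's actual schema: the `pages` table, the `build_index` and `search` helpers and the sample text are hypothetical, and it assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_index(pages):
    """pages: iterable of (path, text) pairs, for example OCR output per PDF."""
    conn = sqlite3.connect(":memory:")
    # path is stored but not indexed, so searches only match the extracted text.
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, text)")
    conn.executemany("INSERT INTO pages (path, text) VALUES (?, ?)", pages)
    return conn


def search(conn, term):
    """Return the paths of documents whose text matches the search term."""
    rows = conn.execute(
        "SELECT path FROM pages WHERE pages MATCH ? ORDER BY path", (term,)
    )
    return [row[0] for row in rows]
```

For example, after indexing `[("a.pdf", "Minutes of the harbor commission"), ("b.pdf", "Annual budget report")]`, calling `search(conn, "harbor")` returns the path of the first document only.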
