Simon Willison on datasette

1,475 posts tagged “datasette”

Datasette is an open source tool for exploring and publishing data.

2023

Weeknotes: AI won’t slow down, a new newsletter and a huge Datasette refactor

I’m a few weeks behind on my weeknotes, but it’s not through lack of attention to my blog. AI just keeps getting weirder and more interesting.

[... 1,255 words]

11:47 pm / 22nd March 2023 / datasette, generative-ai, projects, ai, weeknotes, llms

Datasette: Gather feedback on new ?_extra= design. I just landed the single biggest backwards-incompatible change to Datasette ever, in preparation for the 1.0 release. It’s a change to the default JSON format from the Datasette API: the new format is much slimmer, and can be expanded using a new ?_extra= query string parameter. I’m desperately keen on getting feedback on this change! This issue has more details and a call for feedback.

# 22nd March 2023, 11:14 pm / json, datasette
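As a rough illustration of the ?_extra= idea described above, here is a minimal sketch of building a JSON API URL that asks for additional keys. The host, table path, the extra names "count" and "columns", and the choice to repeat the parameter once per extra are all assumptions made for illustration; the linked issue is the place to check which extras exist and how they should be passed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_extras(url, extras):
    """Return url with one _extra= parameter appended per requested extra.

    Existing query string parameters are preserved in their original order.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Assumption: each extra is sent as its own _extra= parameter.
    query.extend(("_extra", name) for name in extras)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical instance URL and extra names, for illustration only.
url = with_extras(
    "https://example.com/db/table.json?_size=10",
    ["count", "columns"],
)
print(url)
```

Keeping the default response slim and making everything else opt-in means a client only pays for the keys it names in the URL.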

Release datasette-atom 0.9 — Datasette plugin that adds a .atom output format

14th Mar 2023, 3:50 am · datasette, atom

Release datasette-simple-html 0.2 — Datasette SQL functions for very simple HTML operations

12th Mar 2023, 5:30 pm · datasette

Release datasette 0.64.2 — An open source multi-tool for exploring and publishing data

8th Mar 2023, 8:46 pm · datasette

Release datasette-simple-html 0.1 — Datasette SQL functions for very simple HTML operations

1st Mar 2023, 2:35 am · datasette

Release datasette-app 0.2.3 — The Datasette macOS application

27th Feb 2023, 6:40 pm · datasette

Using Datasette in GitHub Codespaces. A new Datasette tutorial showing how it can be run inside GitHub Codespaces—GitHub’s browser-based development environments—in order to explore and analyze data. I’ve been using Codespaces to run tutorials recently and it’s absolutely fantastic, because it puts every tutorial attendee on a level playing field with respect to their development environments.

# 24th February 2023, 12:40 am / tutorials, datasette, github, github-codespaces

Release datasette-codespaces 0.1.1 — Conveniences for running Datasette on GitHub Codespaces

23rd Feb 2023, 11:50 pm · datasette

Release datasette-codespaces 0.1 — Conveniences for running Datasette on GitHub Codespaces

22nd Feb 2023, 6:59 pm · datasette

Analytics: Hacker News v.s. a tweet from Elon Musk

My post Bing: “I will not harm you unless you harm me first” really took off.

[... 817 words]

10:11 pm / 17th February 2023 / twitter, bing, datasette, hacker-news, analytics, cloudflare

Release datasette-app-support 0.11.8 — Part of https://github.com/simonw/datasette-app

17th Feb 2023, 2:42 am · datasette

Release datasette-app-support 0.11.7 — Part of https://github.com/simonw/datasette-app

17th Feb 2023, 2:12 am · datasette

Introducing sqlite-vss: A SQLite Extension for Vector Search (via) This latest SQLite extension from Alex Garcia is possibly his best yet: it adds FAISS-powered vector similarity search directly to SQLite, enabling fast KNN similarity lookups against a virtual table that feels a lot like SQLite’s own built-in full text search feature. This write-up includes interactive demos using Datasette called from an Observable notebook, running similarity searches against an index of 200,000 news headlines and summaries in less than 50ms.

# 10th February 2023, 10:53 pm / vector-search, sqlite, datasette, observable, alex-garcia
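To make "KNN similarity lookup" concrete, here is a minimal brute-force sketch in plain Python. It is not the sqlite-vss API and does not use FAISS: it compares the query against every stored vector, which is exactly the linear scan an index like FAISS is designed to avoid. The two-dimensional vectors and the ids "a", "b" and "c" are made-up stand-ins for real headline embeddings.

```python
import heapq
import math


def knn(query, vectors, k=3):
    """Return the k (distance, id) pairs nearest to query by Euclidean distance.

    vectors maps an id to a list of floats with the same length as query.
    """
    scored = ((math.dist(query, vec), key) for key, vec in vectors.items())
    return heapq.nsmallest(k, scored)


# Hypothetical toy embeddings, for illustration only.
headlines = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [5.0, 5.0],
}

nearest = knn([0.9, 0.1], headlines, k=2)
for distance, key in nearest:
    print(key, round(distance, 3))
```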

Weeknotes: A bunch of things I learned this week, plus datasette-explain

The Datasette table view refactor, JSON redesign and ?_extra= continues this week, mainly in this ongoing pull request and this tracking issue.

[... 1,528 words]

5:57 am / 9th February 2023 / gpt-3, sqlite, plugins, webassembly, datasette, generative-ai, projects, weeknotes

Release datasette-explain 0.1a0 — Explain and validate SQL queries as you type them into Datasette

9th Feb 2023, 2:06 am · datasette

Making SQLite extensions pip install-able (via) Alex Garcia figured out how to bundle a compiled SQLite extension in a Python wheel (building different wheels for different platforms) and publish them to PyPI. This is a huge leap forward in terms of the usability of SQLite extensions, which have previously been pretty difficult to actually install and run. Alex also created Datasette plugins that depend on his packages, so you can now “datasette install datasette-sqlite-regex” (or datasette-sqlite-ulid, datasette-sqlite-fastrand, datasette-sqlite-jsonschema) to gain access to his custom SQLite extensions in your Datasette instance. It even works with “datasette publish --install” to deploy to Vercel, Fly.io and Cloud Run.

# 6th February 2023, 7:44 pm / sqlite, plugins, datasette, python, pip, alex-garcia

datasette-scraper, Big Local News and other weeknotes

In addition to exploring the new MusicCaps training and evaluation data I’ve been working on the big Datasette JSON refactor, and getting excited about a Datasette project that I didn’t work on at all.

[... 1,744 words]

2:52 am / 30th January 2023 / projects, weeknotes, datasette, plugins, shot-scraper, colin-dellow

datasette-scraper walkthrough on YouTube (via) datasette-scraper is Colin Dellow’s new plugin that turns Datasette into a powerful web scraping tool, with a web UI based on plugin-driven customizations to the Datasette interface. It’s really impressive, and this ten minute demo shows quite how much it is capable of: it can crawl sitemaps and fetch pages, caching them (using zstandard with optional custom dictionaries for extra compression) to speed up subsequent crawls... and you can add your own plugins to extract structured data from crawled pages and save it to a separate SQLite table!

# 29th January 2023, 5:23 am / scraping, datasette, plugins, colin-dellow

Examples of sites built using Datasette (via) I gave the examples page on the Datasette website a significant upgrade today: it now includes screenshots (taken using shot-scraper) of six projects chosen to illustrate the variety of problems Datasette can be used to tackle.

# 29th January 2023, 3:40 am / projects, shot-scraper, datasette

We’ve built many tools for publishing to the web - but I want to make the claim that we have underdeveloped the tools and platforms for publishing collections, indexes and small databases. It’s too hard to build these kinds of experiences, too hard to maintain them and a lack of collaborative tools.

— Tom Critchlow

# 28th January 2023, 4:43 pm / datasette

Release datasette-render-markdown 2.1.1 — Datasette plugin for rendering Markdown

27th Jan 2023, 11:31 pm · datasette

Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model

Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).

[... 1,323 words]

9:34 pm / 27th January 2023 / youtube, datasette, google, generative-ai, ai, projects, ethics, training-data, ai-ethics

Release datasette-youtube-embed 0.1 — Turn YouTube URLs into embedded players in Datasette

27th Jan 2023, 8:09 pm · datasette

datasette-granian (via) Granian is a new Python web server—similar to Gunicorn—written in Rust. I built a small plugin that adds a “datasette granian” command starting a Granian server that serves Datasette’s ASGI application, using the same pattern as my existing datasette-gunicorn plugin.

# 20th January 2023, 2:12 am / rust, datasette, asgi

Release datasette-granian 0.1a0 — Run Datasette using the Granian HTTP server

20th Jan 2023, 1:50 am · datasette

Release datasette-faiss 0.2 — Maintain a FAISS index for specified Datasette tables

19th Jan 2023, 11:37 pm · datasette

Datasette is my data hammer (via) Jeremia Kimelman—a data journalist at CalMatters in Sacramento—enthuses about how he uses Datasette as his default hammer for all kinds of data projects—in particular how much he appreciates Datasette’s focus on URLs. So nice to see this!

# 17th January 2023, 5:23 pm / data-journalism, datasette

Weeknotes: AI hacking and a SpatiaLite tutorial

Short weeknotes this time because the key things I worked on have already been covered here:

7:45 pm / 15th January 2023 / gpt-3, datasette, generative-ai, openai, spatialite, ai, weeknotes, vector-search, llms

How to implement Q&A against your documentation with GPT3, embeddings and Datasette

If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have it use that as part of its input for answering questions.

[... 3,447 words]

11:47 pm / 13th January 2023 / generative-ai, gpt-3, sqlite, datasette, projects, search, ai, vector-search, llms, embeddings, rag, ai-assisted-search
