Simon Willison’s Weblog

Subscribe

Entries tagged datasette

Filters: Type: entry × datasette × Sorted by date


Weeknotes: more datasette-secrets, plus a mystery video project

I introduced datasette-secrets two weeks ago. The core idea is to provide a way for end-users to store secrets such as API keys in Datasette, allowing other plugins to access them.

[... 982 words]

Weeknotes: Llama 3, AI for Data Journalism, llm-evals and datasette-secrets

Llama 3 landed on Thursday. I ended up updating a whole bunch of different plugins to work with it, described in Options for accessing Llama 3 from the terminal using LLM.

[... 1030 words]

AI for Data Journalism: demonstrating what we can do with this stuff right now

I gave a talk last month at the Story Discovery at Scale data journalism conference hosted at Stanford by Big Local News. My brief was to go deep into the things we can use Large Language Models for right now, illustrated by a flurry of demos to help provide starting points for further conversations at the conference.

[... 6080 words]

Weeknotes: the aftermath of NICAR

NICAR was fantastic this year. Alex and I ran a successful workshop on Datasette and Datasette Cloud, and I gave a lightning talk demonstrating two new GPT-4 powered Datasette plugins—datasette-enrichments-gpt and datasette-extract. I need to write more about the latter one: it enables populating tables from unstructured content (using a variant of this technique) and it’s really effective. I got it working just in time for the conference.

[... 1430 words]

Weeknotes: Getting ready for NICAR

Next week is NICAR 2024 in Baltimore—the annual data journalism conference hosted by Investigative Reporters and Editors. I’m running a workshop on Datasette, and I plan to spend most of my time in the hallway track talking to people about Datasette, Datasette Cloud and how the Datasette ecosystem can best help support their work.

[... 1390 words]

Weeknotes: a Datasette release, an LLM release and a bunch of new plugins

I wrote extensive annotated release notes for Datasette 1.0a8 and LLM 0.13 already. Here’s what else I’ve been up to this past three weeks.

[... 1074 words]

Datasette 1.0a8: JavaScript plugins, new plugin hooks and plugin configuration in datasette.yaml

I just released Datasette 1.0a8. These are the annotated release notes.

[... 1709 words]

Weeknotes: datasette-test, datasette-build, PSF board retreat

I wrote about Page caching and custom templates in my last weeknotes. This week I wrapped up that work, modifying datasette-edit-templates to be compatible with the jinja2_environment_from_request() plugin hook. This means you can edit templates directly in Datasette itself and have those served either for the full instance or just for the instance when served from a specific domain (the Datasette Cloud case).

[... 757 words]

Weeknotes: Page caching and custom templates for Datasette Cloud

My main development focus this week has been adding public page caching to Datasette Cloud, and exploring what custom template support might look like for that service.

[... 924 words]

Weeknotes: datasette-enrichments, datasette-comments, sqlite-chronicle

I’ve mainly been working on Datasette Enrichments and continuing to explore the possibilities enabled by sqlite-chronicle.

[... 1123 words]

Datasette Enrichments: a new plugin framework for augmenting your data

Today I’m releasing datasette-enrichments, a new feature for Datasette which provides a framework for applying “enrichments” that can augment your data.

[... 1202 words]

Weeknotes: DevDay, GitHub Universe, OpenAI chaos

Three weeks of conferences and Datasette Cloud work, four days of chaos for OpenAI.

[... 766 words]

Financial sustainability for open source projects at GitHub Universe

I presented a ten minute segment at GitHub Universe on Wednesday, ambitiously titled Financial sustainability for open source projects.

[... 2485 words]

Weeknotes: PyBay, AI Engineer Summit, Datasette metadata and JavaScript plugins

I’ve had a bit of a slow two weeks in terms of building things and writing code, thanks mainly to a couple of conference appearances. I did review and land a couple of major contributions to Datasette though.

[... 564 words]

Weeknotes: the Datasette Cloud API, a podcast appearance and more

Datasette Cloud now has a documented API, plus a podcast appearance, some LLM plugins work and some geospatial excitement.

[... 1243 words]

Weeknotes: Embeddings, more embeddings and Datasette Cloud

Since my last weeknotes, a flurry of activity. LLM has embeddings support now, and Datasette Cloud has driven some major improvements to the wider Datasette ecosystem.

[... 2427 words]

Datasette 1.0a4 and 1.0a5, plus weeknotes

Two new alpha releases of Datasette, plus a keynote at WordCamp, a new LLM release, two new LLM plugins and a flurry of TILs.

[... 2709 words]

Datasette Cloud, Datasette 1.0a3, llm-mlc and more

Datasette Cloud is now a significant step closer to general availability. The Datasette 1.03 alpha release is out, with a mostly finalized JSON format for 1.0. Plus new plugins for LLM and sqlite-utils and a flurry of things I’ve learned.

[... 1690 words]

Weeknotes: Plugins for LLM, sqlite-utils and Datasette

The principle theme for the past few weeks has been plugins.

[... 1203 words]

Weeknotes: Self-hosted language models with LLM plugins, a new Datasette tutorial, a dozen package releases, a dozen TILs

A lot of stuff to cover from the past two and a half weeks.

[... 1742 words]

Weeknotes: Parquet in Datasette Lite, various talks, more LLM hacking

I’ve fallen a bit behind on my weeknotes. Here’s a catchup for the last few weeks.

[... 769 words]

Data analysis with SQLite and Python for PyCon 2023

I’m at PyCon 2023 in Salt Lake City this week.

[... 347 words]

What’s in the RedPajama-Data-1T LLM training set

RedPajama is “a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens”. It’s a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research, and MILA Québec AI Institute.

[... 1077 words]

sqlite-history: tracking changes to SQLite tables using triggers (also weeknotes)

In between blogging about ChatGPT rhetoric, micro-benchmarking with ChatGPT Code Interpreter and Why prompt injection is an even bigger problem now I managed to ship the beginnings of a new project: sqlite-history.

[... 1680 words]

Weeknotes: A new llm CLI tool, plus automating my weeknotes and newsletter

I started publishing weeknotes in 2019 partly as a way to hold myself accountable but mainly as a way to encourage myself to write more.

[... 830 words]

Semi-automating a Substack newsletter with an Observable notebook

I recently started sending out a weekly-ish email newsletter consisting of content from my blog. I’ve mostly automated that, using an Observable Notebook to generate the HTML. Here’s how that system works.

[... 2520 words]

I built a ChatGPT plugin to answer questions about data hosted in Datasette

Yesterday OpenAI announced support for ChatGPT plugins. It’s now possible to teach ChatGPT how to make calls out to external APIs and use the responses to help generate further answers in the current conversation.

[... 1801 words]

Weeknotes: AI won’t slow down, a new newsletter and a huge Datasette refactor

I’m a few weeks behind on my weeknotes, but it’s not through lack of attention to my blog. AI just keeps getting weirder and more interesting.

[... 1255 words]

Analytics: Hacker News v.s. a tweet from Elon Musk

My post Bing: “I will not harm you unless you harm me first” really took off.

[... 817 words]

Weeknotes: A bunch of things I learned this week, plus datasette-explain

The Datasette table view refactor, JSON redesign and ?_extra= continues this week, mainly in this ongoing pull request and this tracking issue.

[... 1528 words]