45 items tagged “datasette-cloud”
The SaaS hosted version of Datasette, at datasette.cloud
2024
datasette-llm-usage. I released the first alpha of a Datasette plugin to help track LLM usage by other plugins, with the goal of supporting token allowances - both for things like free public apps that stop working after a daily allowance, plus free previews of AI features for paid-account-based projects such as Datasette Cloud.
It's using the usage features I added in LLM 0.19.
The alpha doesn't do much yet - it will start getting interesting once I upgrade other plugins to depend on it.
Design notes so far in issue #1.
Visualizing local election results with Datasette, Observable and MapLibre GL
Alex Garcia and myself hosted the first Datasette Open Office Hours on Friday—a live-streamed video session where we hacked on a project together and took questions and tips from community members on Discord.
[... 3,390 words]Weeknotes: a staging environment, a Datasette alpha and a bunch of new LLMs
My big achievement for the last two weeks was finally wrapping up work on the Datasette Cloud staging environment. I also shipped a new Datasette 1.0 alpha and added support to the LLM ecosystem for a bunch of newly released models.
[... 1,465 words]Weeknotes: GPT-4o mini, LLM 0.15, sqlite-utils 3.37 and building a staging environment
Upgrades to LLM to support the latest models, and a whole bunch of invisible work building out a staging environment for Datasette Cloud.
[... 730 words]Weeknotes: a livestream, a surprise keynote and progress on Datasette Cloud billing
My first YouTube livestream with Val Town, a keynote at the AI Engineer World’s Fair and some work integrating Stripe with Datasette Cloud. Plus a bunch of upgrades to my blog.
[... 1,124 words]datasette-pins — a new Datasette plugin for pinning tables and queries. Alex Garcia built this plugin for Datasette Cloud, and as with almost every Datasette Cloud features we're releasing it as an open source package as well.
datasette-pins
allows users with the right permission to "pin" tables, databases and queries to their homepage. It's a lightweight way to customize that homepage, especially useful as your Datasette instance grows to host dozens or even hundreds of tables.
AI for Data Journalism: demonstrating what we can do with this stuff right now
I gave a talk last month at the Story Discovery at Scale data journalism conference hosted at Stanford by Big Local News. My brief was to go deep into the things we can use Large Language Models for right now, illustrated by a flurry of demos to help provide starting points for further conversations at the conference.
[... 6,081 words]Three major LLM releases in 24 hours (plus weeknotes)
I’m a bit behind on my weeknotes, so there’s a lot to cover here. But first... a review of the last 24 hours of Large Language Model news. All times are in US Pacific on April 9th 2024.
[... 1,401 words]Extracting data from unstructured text and images with Datasette and GPT-4 Turbo. Datasette Extract is a new Datasette plugin that uses GPT-4 Turbo (released to general availability today) and GPT-4 Vision to extract structured data from unstructured text and images.
I put together a video demo of the plugin in action today, and posted it to the Datasette Cloud blog along with screenshots and a tutorial describing how to use it.
Add ETag header for static responses. I’ve been procrastinating on adding better caching headers for static assets (JavaScript and CSS) served by Datasette for several years, because I’ve been wanting to implement the perfect solution that sets far-future cache headers on every asset and ensures the URLs change when they are updated.
Agustin Bacigalup just submitted the best kind of pull request: he observed that adding ETag support for static assets would side-step the complexity while adding much of the benefit, and implemented it along with tests.
It’s a substantial performance improvement for any Datasette instance with a number of JavaScript plugins... like the ones we are building on Datasette Cloud. I’m just annoyed we didn’t ship something like this sooner!
Weeknotes: the aftermath of NICAR
NICAR was fantastic this year. Alex and I ran a successful workshop on Datasette and Datasette Cloud, and I gave a lightning talk demonstrating two new GPT-4 powered Datasette plugins—datasette-enrichments-gpt and datasette-extract. I need to write more about the latter one: it enables populating tables from unstructured content (using a variant of this technique) and it’s really effective. I got it working just in time for the conference.
[... 1,430 words]Weeknotes: Getting ready for NICAR
Next week is NICAR 2024 in Baltimore—the annual data journalism conference hosted by Investigative Reporters and Editors. I’m running a workshop on Datasette, and I plan to spend most of my time in the hallway track talking to people about Datasette, Datasette Cloud and how the Datasette ecosystem can best help support their work.
[... 1,390 words]dclient 0.3. dclient is my CLI utility for working with remote Datasette instances—in particular for authenticating with them and then running both read-only SQL queries and inserting data using the new Datasette write JSON API. I just picked up work on the project again after a six month gap—the insert command can now be used to constantly stream data directly to hosted Datasette instances such as Datasette Cloud.
GPUs on Fly.io are available to everyone! We’ve been experimenting with GPUs on Fly for a few months for Datasette Cloud. They’re well documented and quite easy to use—any example Python code you find that uses NVIDIA CUDA stuff generally Just Works. Most interestingly of all, Fly GPUs can scale to zero—so while they cost $2.50/hr for a A100 40G (VRAM) and $3.50/hr for a A100 80G you can configure them to stop running when the machine runs out of things to do.
We’ve successfully used them to run Whisper and to experiment with running various Llama 2 LLMs as well.
To look forward to: “We are working on getting some lower-cost A10 GPUs in the next few weeks”.
Weeknotes: datasette-test, datasette-build, PSF board retreat
I wrote about Page caching and custom templates in my last weeknotes. This week I wrapped up that work, modifying datasette-edit-templates to be compatible with the jinja2_environment_from_request() plugin hook. This means you can edit templates directly in Datasette itself and have those served either for the full instance or just for the instance when served from a specific domain (the Datasette Cloud case).
[... 757 words]Weeknotes: Page caching and custom templates for Datasette Cloud
My main development focus this week has been adding public page caching to Datasette Cloud, and exploring what custom template support might look like for that service.
[... 924 words]2023
Annotate and explore your data with datasette-comments. New plugin for Datasette and Datasette Cloud: datasette-comments, providing tools for collaborating on data exploration with a team through posting comments on individual rows of data.
Alex Garcia built this for Datasette Cloud but as with almost all of our work there it’s also available as an open source Python package.
Weeknotes: DevDay, GitHub Universe, OpenAI chaos
Three weeks of conferences and Datasette Cloud work, four days of chaos for OpenAI.
[... 766 words]Financial sustainability for open source projects at GitHub Universe
I presented a ten minute segment at GitHub Universe on Wednesday, ambitiously titled Financial sustainability for open source projects.
[... 2,485 words]Weeknotes: the Datasette Cloud API, a podcast appearance and more
Datasette Cloud now has a documented API, plus a podcast appearance, some LLM plugins work and some geospatial excitement.
[... 1,243 words]Getting started with the Datasette Cloud API. I wrote an introduction to the Datasette Cloud API for the company blog, with a tutorial showing how to use Python and GitHub Actions to import data from the Federal Register into a table in Datasette Cloud, then configure full-text search against it.
Weeknotes: Embeddings, more embeddings and Datasette Cloud
Since my last weeknotes, a flurry of activity. LLM has embeddings support now, and Datasette Cloud has driven some major improvements to the wider Datasette ecosystem.
[... 2,427 words]Introducing datasette-litestream: easy replication for SQLite databases in Datasette. We use Litestream on Datasette Cloud for streaming backups of user data to S3. Alex Garcia extracted out our implementation into a standalone Datasette plugin, which bundles the Litestream Go binary (for the relevant platform) in the package you get when you run “datasette install datasette-litestream”—so now Datasette has a very robust answer to questions about SQLite disaster recovery beyond just the Datasette Cloud platform.
Datasette Cloud and the Datasette 1.0 alphas. I sent out the Datasette Newsletter for the first time in quite a while, with updates on Datasette Cloud, the Datasette 1.0 alphas, a note about the security vulnerability in those alphas and a summary of some of my research into combining LLMs with Datasette.
Datasette Cloud, Datasette 1.0a3, llm-mlc and more
Datasette Cloud is now a significant step closer to general availability. The Datasette 1.03 alpha release is out, with a mostly finalized JSON format for 1.0. Plus new plugins for LLM and sqlite-utils and a flurry of things I’ve learned.
[... 1,690 words]Introducing datasette-write-ui: a Datasette plugin for editing, inserting, and deleting rows. Alex García is working with me on Datasette Cloud for the next few months, graciously sponsored by Fly. We will be working in public, releasing open source code and documenting how to build a multi-tenant SaaS product using Fly Machines.
Alex’s first project is datasette-write-ui, a plugin that finally lets you directly edit data stored inside Datasette. Alex wrote about the plugin on our new Datasette Cloud blog.
Welcome to Datasette Cloud. We launched the Datasette Cloud blog today! The SaaS hosted version of Datasette is ready to start onboarding more users—this post describes what it can do so far and hints at what’s planned to come next.
2022
Designing a write API for Datasette
Building out Datasette Cloud has made one thing clear to me: Datasette needs a write API for ingesting new data into its attached SQLite databases.
[... 1,493 words]Weeknotes: Publishing data using Datasette Cloud
My initial preview releases of Datasette Cloud (the SaaS version of my Datasette open source project) have focused on private data collaboration.
[... 582 words]Weeknotes: Datasette Cloud preview invitations
This week I finally started sending out invitations for people to try out the preview of the new Datasette Cloud, my SaaS offering for Datasette.
[... 713 words]