Items tagged datasette, generativeai in 2023
Weeknotes: the Datasette Cloud API, a podcast appearance and more
Datasette Cloud now has a documented API, plus a podcast appearance, some LLM plugins work and some geospatial excitement.
[... 1243 words]
Weeknotes: Self-hosted language models with LLM plugins, a new Datasette tutorial, a dozen package releases, a dozen TILs
A lot of stuff to cover from the past two and a half weeks.
[... 1742 words]
What’s in the RedPajama-Data-1T LLM training set
RedPajama is “a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens”. It’s a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research, and MILA Québec AI Institute.
[... 1077 words]
Weeknotes: AI won’t slow down, a new newsletter and a huge Datasette refactor
I’m a few weeks behind on my weeknotes, but it’s not through lack of attention to my blog. AI just keeps getting weirder and more interesting.
[... 1255 words]
Weeknotes: A bunch of things I learned this week, plus datasette-explain
The Datasette table view refactor, JSON redesign and ?_extra= continues this week, mainly in this ongoing pull request and this tracking issue.
Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model
Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).
[... 1323 words]
Weeknotes: AI hacking and a SpatiaLite tutorial
Short weeknotes this time because the key things I worked on have already been covered here:
[... 477 words]
How to implement Q&A against your documentation with GPT3, embeddings and Datasette
If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have them use that as part of their input for answering questions.
[... 3491 words]