Sunday, 1st December 2024
Most people don’t have an intuition about what current hardware can and can’t do. There is a simple math that can help you with that: “you can process about 500MB in one second on a single machine”. I know it’s not a universal truth and there are a lot of details that can change that but believe me, this estimation is a pretty good tool to have under your belt.
Turning Your Root URL Into a DuckDB Remote Database. Fun idea from Drew Breunig: DuckDB supports attaching existing databases that are accessible over HTTP using their URL. Drew suggests creating vanity URLs using your root domain, detecting the DuckDB user-agent and serving the database file directly - allowing tricks like this one:
ATTACH 'https://steplist.app/' AS steplist;
SELECT * FROM steplist.lists;
LLM 0.19. I just released version 0.19 of LLM, my Python library and CLI utility for working with Large Language Models.
I released 0.18 a couple of weeks ago adding support for calling models from Python asyncio
code. 0.19 improves on that, and also adds a new mechanism for models to report their token usage.
LLM can log those usage numbers to a SQLite database, or make then available to custom Python code.
My eventual goal with these features is to implement token accounting as a Datasette plugin so I can offer AI features in my SaaS platform without worrying about customers spending unlimited LLM tokens.
Those 0.19 release notes in full:
- Tokens used by a response are now logged to new
input_tokens
andoutput_tokens
integer columns and atoken_details
JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610llm prompt
now takes a-u/--usage
flag to display token usage at the end of the response.llm logs -u/--usage
shows token usage information for logged responses.llm prompt ... --async
responses are now logged to the database. #641llm.get_models()
andllm.get_async_models()
functions, documented here. #640response.usage()
and async responseawait response.usage()
methods, returning aUsage(input=2, output=1, details=None)
dataclass. #644response.on_done(callback)
andawait response.on_done(callback)
methods for specifying a callback to be executed when a response has completed, documented here. #653- Fix for bug running
llm chat
on Windows 11. Thanks, Sukhbinder Singh. #495
I also released three new plugin versions that add support for the new usage tracking feature: llm-gemini 0.5, llm-claude-3 0.10 and llm-mistral 0.9.