281 items tagged “sqlite”
SQLite is the world's most widely deployed database engine.
2018
Notes from my appearance on the Changelog podcast
After I spoke at Zeit Day SF last weekend I sat down with Adam Stacoviak to record a 25-minute segment for episode 296 of the Changelog podcast, talking about Datasette. We covered a lot of ground!
[... 536 words]
Exploring the UK Register of Members Interests with SQL and Datasette
Ever wondered which UK Members of Parliament get gifted the most helicopter rides? How about which MPs have been given Christmas hampers by the Sultan of Brunei? (David Cameron, William Hague and Michael Howard apparently). Here’s how to dig through the Register of Members Interests using SQL and Datasette.
[... 1,167 words]
csvs-to-sqlite 0.8. I released a new version of my csvs-to-sqlite tool this morning with a bunch of handy new features. It can now rename columns and define their types, add the CSV filenames as an additional column, create indexes on columns and parse dates and datetimes into SQLite-friendly ISO formatted values.
How I made a Who’s On First subset database. Inspired by Paul Ford on Twitter, I tried out a new trick with SQLite: connect to a database containing JSON, attach a brand new empty database file using “attach database”, then populate it using INSERT INTO ... SELECT plus the json_extract() function to extract out a subset of the JSON properties into a new table in the new database.
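The attach-and-copy trick described above can be sketched with Python's sqlite3 module. This is a minimal illustration, not the code from the linked post: the table names, file names and sample JSON are all hypothetical, and it assumes the SQLite build behind Python includes the JSON functions.

```python
import json
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "source.db")  # hypothetical file names
subset_path = os.path.join(workdir, "subset.db")

# Build a small source database holding raw JSON documents (made-up sample data).
source = sqlite3.connect(source_path)
source.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, body TEXT)")
source.executemany(
    "INSERT INTO places (id, body) VALUES (?, ?)",
    [
        (1, json.dumps({"name": "San Francisco", "placetype": "locality"})),
        (2, json.dumps({"name": "California", "placetype": "region"})),
        (3, json.dumps({"name": "Oakland", "placetype": "locality"})),
    ],
)
source.commit()  # finish the open transaction before attaching

# Attach a brand new, empty database file under the schema name "subset".
source.execute("ATTACH DATABASE ? AS subset", (subset_path,))
source.execute(
    "CREATE TABLE subset.places (id INTEGER PRIMARY KEY, name TEXT, placetype TEXT)"
)

# Populate it with INSERT INTO ... SELECT, pulling a few properties out of the JSON.
source.execute(
    """
    INSERT INTO subset.places (id, name, placetype)
    SELECT id, json_extract(body, '$.name'), json_extract(body, '$.placetype')
    FROM main.places
    WHERE json_extract(body, '$.placetype') = 'locality'
    """
)
source.commit()
source.close()

# The new file is now a standalone database containing only the subset.
subset = sqlite3.connect(subset_path)
rows = subset.execute("SELECT id, name, placetype FROM places ORDER BY id").fetchall()
subset.close()
```

The subset file can then be opened on its own, without the original JSON-heavy database.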
[On SQLite] The JSON interface is like, "we save the text and when you retrieve it we parse the JSON at several hundred MB/s and let you do path queries against it please stop overthinking it, this is filing cabinet."
SQLite: The Spellfix1 Virtual Table (via) A SQLite extension that lets you create a spellfix1 virtual table which can power “fuzzy” search, by suggesting corrections for misspelled words. I haven’t tried this yet but it looks pretty powerful, including a configurable edit distance and the ability to set up custom “soundslike” terms for words with known unusual spellings.
Analyzing my Twitter followers with Datasette
I decided to do some ad-hoc analysis of my social network on Twitter this afternoon… and since everything is more fun if you bundle it up into a SQLite database and publish it to the internet I performed the analysis using Datasette.
[... 1,314 words]
How to turn a list of JSON objects into a Datasette. ramadis on GitHub cleaned up data on 184,879 crimes reported in Buenos Aires since 2016 and shared them on GitHub as a JSON file. Here are my notes on how to use Pandas to convert JSON into SQLite and publish it using Datasette.
How to compile and run the SQLite JSON1 extension on OS X. Thanks, Stack Overflow! I’ve been battling this one for a while—it turns out you can download the SQLite source bundle, compile just the json1.c file using gcc and load that extension in Python’s sqlite3 module (or with Datasette’s --load-extension= option) to gain access to the full suite of SQLite JSON functions—json(), json_extract() etc.
2017
Building a location to time zone API with SpatiaLite, OpenStreetMap and Datasette
Given a latitude and longitude, how can we tell what time zone that point lies within? Here’s how I built a simple JSON API to answer that question, using a combination of data from OpenStreetMap, the SpatiaLite extension for SQLite and my Datasette API tool.
[... 2,185 words]
Big Data Workflow with Pandas and SQLite (via) Handy tutorial on dealing with larger data (in this case a 3.9GB CSV file) by incrementally loading it into pandas and writing it out to SQLite.
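The incremental-loading idea can be sketched without pandas. The example below swaps in Python's standard csv module and itertools.islice in place of the pandas chunked reader the tutorial uses, so it is an illustration of the pattern rather than the tutorial's code. The function name load_csv_in_chunks, the table name and the sample data are hypothetical.

```python
import csv
import io
import itertools
import sqlite3


def load_csv_in_chunks(csv_file, conn, table, chunk_size=1000):
    """Copy a CSV into SQLite one chunk at a time, so the whole file never sits in memory."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    total = 0
    while True:
        # Pull at most chunk_size rows from the reader.
        chunk = list(itertools.islice(reader, chunk_size))
        if not chunk:
            break
        with conn:  # one transaction per chunk
            conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', chunk)
        total += len(chunk)
    return total


# Made-up sample standing in for a large CSV file on disk.
sample = io.StringIO("id,city\n" + "".join(f"{i},city-{i}\n" for i in range(2500)))
conn = sqlite3.connect(":memory:")
inserted = load_csv_in_chunks(sample, conn, "cities", chunk_size=1000)
```

Committing once per chunk keeps memory use bounded by the chunk size while avoiding the cost of a transaction per row.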
Many Small Queries Are Efficient In SQLite. Since SQLite runs in-process rather than being accessed over a network it avoids the per-query overhead of network round trips. This means that while MySQL or PostgreSQL applications need to avoid N+1 query patterns that create 100s of queries per request, SQLite apps can be designed differently: provided you hit indexes or small tables, 200 queries just means 200 extra cheap function calls.
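As a rough illustration of the pattern, here is an N+1 style loop against an in-memory SQLite database using Python's sqlite3 module. The schema and data are invented for the example, and it makes no timing claims: it only shows that each extra query is an in-process call that hits the primary key index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [(i, f"author-{i}") for i in range(1, 201)],
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(i, i, f"post-{i}") for i in range(1, 201)],
)
conn.commit()

# One query for the list, then one small query per row: the classic N+1 shape.
posts = conn.execute("SELECT id, author_id, title FROM posts ORDER BY id").fetchall()
author_names = []
for _post_id, author_id, _title in posts:
    # Primary key lookup: a cheap in-process call, no network round trip.
    row = conn.execute("SELECT name FROM authors WHERE id = ?", (author_id,)).fetchone()
    author_names.append(row[0])
```

Against a networked database the same loop would pay a round trip per iteration, which is why the pattern is usually avoided there.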
SQLite Query Language: WITH clause. SQLite’s documentation on recursive CTEs starts out with some nice clear examples of tree traversal using a WITH statement, then gets into graphs, then goes way off the deep end with a Mandelbrot Set query and a query that can solve Sudoku puzzles (“in less than 300 milliseconds on a modern workstation”).
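A small tree traversal in the same spirit, run through Python's sqlite3 module. The org table and its rows are made up for this sketch and are not copied from the SQLite documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE org (name TEXT PRIMARY KEY, boss TEXT)")
conn.executemany(
    "INSERT INTO org VALUES (?, ?)",
    [
        ("Alice", None),
        ("Bob", "Alice"),
        ("Cindy", "Alice"),
        ("Dave", "Bob"),
        ("Emma", "Bob"),
        ("Zed", None),  # a separate tree, not reachable from Alice
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE under_alice(name, level) AS (
        -- Anchor row: start at the root of the tree we care about.
        SELECT 'Alice', 0
        UNION ALL
        -- Recursive step: add everyone whose boss is already in the result.
        SELECT org.name, under_alice.level + 1
        FROM org JOIN under_alice ON org.boss = under_alice.name
    )
    SELECT name, level FROM under_alice ORDER BY level, name
    """
).fetchall()
# rows == [("Alice", 0), ("Bob", 1), ("Cindy", 1), ("Dave", 2), ("Emma", 2)]
```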
New in Datasette: filters, foreign keys and search
I’ve released Datasette 0.13 with a number of exciting new features (Datasette previously).
[... 1,143 words]
harelba/q (via) q is a neat command-line utility that lets you run SQL queries directly against CSV and TSV files. Internally it works by firing up an in-memory SQLite database, and as of the latest release (1.7.1) you can use the new --save-db-to-disk option to save that in-memory database to disk.
If you’re a public data provider—and many large NGOs, government organizations, cultural organizations, historical archives, media organizations, medical orgs, and academic institutions are exactly that—you can publish gigabytes of data, and make it available as an API, and make it easy to browse on the web, too, with extremely low effort. Put it into SQLite, point this little guy at it, and you’ve just radically increased the accessibility and utility of your data. Because messing around in SQL from a web browser is orders of magnitude more immediately useful than downloading a CSV, processing it, and figuring out what comes next.
Datasette: instantly create and publish an API for your SQLite databases
I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases.
[... 968 words]
simonw/csvs-to-sqlite. I built a simple tool for bulk converting multiple CSV files into a SQLite database.
Pull request #4120 · python/cpython. I just had my first ever change merged into Python! It was a one-sentence documentation improvement (on how to cancel SQLite operations) but it was fascinating seeing how Python’s GitHub flow is set up: clever use of labels, plus a bot that automatically checks that you have signed a copy of their CLA.
A Minimalist Guide to SQLite. Pretty comprehensive actually—covers the sqlite3 command line app, importing CSVs, integrating with Python, Pandas and Jupyter notebooks, visualization and more.
System Requirements For SQLite. Document describing the high-level goals and objectives of SQLite. Like everything to do with SQLite this exhibits some incredibly well thought out software engineering. I particularly like “S80000: SQLite shall exhibit ductile failure characteristics” where ductile is described in opposition to brittle: a ductile system begins showing signs of trouble well in advance of failure.
Getting the Most out of Sqlite3 with Python. A couple of neat tricks I didn’t know: you can skip cursors entirely by calling .execute and .executemany directly on the connection object, and you can use the connection object as a context manager to execute transactions using a “with” block.
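Both tricks can be seen in a few lines. This is a minimal sketch with a hypothetical notes table, assuming the sqlite3 module's default transaction handling: the with block commits on success and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No explicit cursor: execute() on the connection creates one behind the scenes.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# The connection as a context manager: commits when the block exits cleanly.
with conn:
    conn.executemany("INSERT INTO notes (body) VALUES (?)", [("first",), ("second",)])

# If the block raises, everything done inside it is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO notes (body) VALUES (?)", ("third",))
        conn.execute("INSERT INTO notes (body) VALUES (?)", (None,))  # violates NOT NULL
except sqlite3.IntegrityError:
    pass

bodies = [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY id")]
# bodies == ["first", "second"]: the "third" insert was rolled back with the failed one
```

Note that the with block manages the transaction only; it does not close the connection.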
2010
last.fm for television. Dale Lane’s neat hack to visualise his television watching habits. An Ubuntu / vdx home theatre stores TV events in SQLite, and graphs are generated using Python and Open Flash Chart 2. The really clever bit: the back-end captures nearby Bluetooth IDs, allowing events to be filtered by the people watching based on the presence of their mobile phones.
2009
Browsing my browsing. Roo Reynolds used the MeeTimer Firefox extension to gather statistics on his browsing habits, then extracted data directly from the SQLite database and generated his own graphs using PHP and the canvas element.
2008
iPhone Backup Extractor possibilities (via) Nick Ludlam points out that iTunes backs up your iPhone call records by copying across a SQLite database, which means it wouldn’t be at all hard to extract the logs into a larger database. Could make for a really cool addition to a private lifestreaming application.
GeoCouch: Geospatial queries with CouchDB. Interesting approach: uses “external2”, a branch that allows external services to be called from CouchDB. SQLite’s SpatiaLite extension is then used as an external spatial index.
asql—Apache SQL querying. Command line tool for loading web server log files in common log format into a SQLite database, with a built-in interactive shell.
Gearshift. Whoa, a full migrations library written in JavaScript for Gears (which uses SQLite for its data store).
Django Unit Tests and Transactions. If you’re using a transactional database engine (MySQL with InnoDB, Postgres or SQLite) you can speed things up by running each of your unit tests inside a transaction and rolling back in tearDown().
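The rollback-in-tearDown idea can be sketched with plain unittest and sqlite3. This is not Django's test runner or the linked post's code: the shared connection, the users table and the run_suite helper are all hypothetical, and the sketch relies on the sqlite3 module implicitly opening a transaction before the first INSERT or DELETE.

```python
import sqlite3
import unittest

# Hypothetical shared connection with fixture data, set up and committed once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('fixture')")
conn.commit()


class RollbackTestCase(unittest.TestCase):
    def setUp(self):
        self.conn = conn

    def tearDown(self):
        # Undo whatever the test wrote instead of rebuilding the database.
        self.conn.rollback()

    def test_insert_is_visible_inside_test(self):
        self.conn.execute("INSERT INTO users (name) VALUES ('temp')")
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 2)

    def test_delete_is_visible_inside_test(self):
        self.conn.execute("DELETE FROM users")
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 0)


def run_suite():
    """Run both tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RollbackTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test sees the committed fixture row regardless of run order, because the previous test's writes were rolled back rather than committed.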
2007
OpenStreetMap on the iPhone! Via an ingenious hack. The Google Maps iPhone client caches downloaded tiles using SQLite; to display your own custom tiles, you just need to dump them straight into the “cache”.