Simon Willison’s Weblog

115 items tagged “json”


dolthub/jsplit (via) Neat Go CLI tool for working with truly gigantic JSON files. This assumes files will be an object with one or more keys that are themselves huge lists of objects—it than extracts those lists out into one or more newline-delimited JSON files (capping their size at 4GB) which are much easier to work with as streams of data. # 6th September 2022, 8:27 pm

Introducing sqlite-lines—a SQLite extension for reading files line-by-line (via) Alex Garcia wrote a brilliant C module for SQLIte which adds functions (and a table-valued function) for efficiently reading newline-delimited text into SQLite. When combined with SQLite’s built-in JSON features this means you can read a huge newline-delimited JSON file into SQLite in a streaming fashion so it doesn’t exhaust memory for a large file. Alex also compiled the extension to WebAssembly, and his post here is an Observable notebook post that lets you exercise the code directly. # 30th July 2022, 7:18 pm

jless (via) A really nice new command-line JSON viewer, written in Rust, created by Paul Julius Martinez. It provides a terminal interface for navigating through large JSON files, including expanding and contracting nested objects and searching for strings or a modified form of regular expressions. # 12th February 2022, 3:17 am


jc (via) This is such a great idea: jc is a CLI tool which knows how to convert the output of dozens of different classic Unix utilities to JSON, so you can more easily process it programmatically, pipe it through jq and suchlike. “pipx install jc” to install, then “dig | jc --dig” to try it out. # 5th December 2021, 11:05 pm

lex.go in json5-go. This archived GitHub repository has a beautifully clean and clear example of a hand-written lexer in Go, for the JSON5 format (JSON + comments + multi-line strings). parser.go is worth a look too. # 19th August 2021, 8:15 pm

Joining CSV and JSON data with an in-memory SQLite database

The new sqlite-utils memory command can import CSV and JSON data directly into an in-memory SQLite database, combine and query it using SQL and output the results as CSV, JSON or various other formats of plain text tables.

[... 1507 words]

Understanding JSON Schema (via) Useful, comprehensive short book guide to JSON Schema, which finally helped me feel like I fully understand the specification. # 24th March 2021, 2:57 am


airtable-export. I wrote a command-line utility for exporting data from Airtable and dumping it to disk as YAML, JSON or newline delimited JSON files. This means you can backup an Airtable database from a GitHub Action and get a commit history of changes made to your data. # 29th August 2020, 9:48 pm

Better Python Object Serialization. TIL about functions.singledispatch, a decorator which makes it easy to create Python functions with implementations that vary based on the type of their arguments and which can have additional implementations registered after the fact—great for things like custom JSON serialization. # 7th January 2020, 8:35 pm


Subsume JSON a.k.a. JSON ⊂ ECMAScript (via) TIL that JSON isn’t a subset of ECMAScript after all! “In ES2018, ECMAScript string literals couldn’t contain unescaped U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR characters, because they are considered to be line terminators even in that context.” # 15th August 2019, 10:30 am

json-flatten. A little Python library I wrote that attempts to flatten a JSON object into a set of key/value pairs suitable for transmitting in a query string or using to construct an HTML form. I first wrote this back in 2015 as a Gist—I’ve reconstructed the Gist commit history in a new repository and shipped it to PyPI. # 22nd June 2019, 4:51 am

paginate-json (via) I released a fun tiny utility: paginate-json, which knows how to paginate through JSON APIs that use the HTTP Link header for pagination. I built it so I could pull data from the GitHub API and pipe it directly into SQLite via sqlite-utils. # 12th June 2019, 3:22 pm

quicktype code generator for Python. Really interesting tool: give it an example JSON document and it will code-generate the equivalent set of Python classes (with type annotations) instantly in your browser. It also accepts input in JSON Schema or TypeScript and can generate code in 18 different languages. # 14th May 2019, 11:35 pm

zson (via) “ZSON is a PostgreSQL extension for transparent JSONB compression. Compression is based on a shared dictionary of strings most frequently used in specific JSONB documents [...] In some cases ZSON can save half of your disk space and give you about 10% more TPS.” # 2nd April 2019, 9:26 pm

Datasette 0.27 (via) The latest release of Datasette introduces an option to output tables and SQL query results as newline-delimited JSON—plus a new “datasette plugins” command for listing available plugins. # 1st February 2019, 4:39 am


jq recipes. Remy Sharp’s handy collection of jq recipes, each one linking to an interactive demo on I thought jq was just for extracting values from a JSON document—I hadn’t realized how powerful it was for modifying and extending those documents as well. # 22nd August 2018, 3:23 pm

JSON Escape Text. I built a tiny tool for turning text into an escaped JSON string—I needed it to help create descriptions and canned SQL queries for adding to Datasette’s metadata.json files. # 25th April 2018, 4:13 am

react-jsonschema-form. Exciting library from the Mozilla Services team: given a JSON Schema definition, react-jsonschema-form can produce a cleanly designed React-powered form for adding and editing data that matches that schema. Includes support for adding multiple items in a nested array, re-ordering them, custom form widgets and more. # 23rd April 2018, 9:38 pm

gron. Ingenious tool for working with JSON on the command line: run “gron URL/filepath” to transform a JSON document into a multi-line assignment structure designed to be easy to run grep against. Grep it, then pipe it back into “gron --ungron” to convert the filtered data back to JSON again. It solves a similar problem to jq—which is addressed in the README: “gron’s primary purpose is to make it easy to find the path to a value in a deeply nested JSON blob when you don’t already know the structure; much of jq’s power is unlocked only once you know that structure”. # 3rd April 2018, 9:16 pm

How I made a Who’s On First subset database. Inspired by Paul Ford on Twitter, I tried out a new trick with SQLite: connect to a database containing JSON, attach a brand new empty database file using “attach database”, then populate it using INSERT INTO ... SELECT plus the json_extract() function to extract out a subset of the JSON properties into a new table in the new database. # 3rd February 2018, 5:25 am

[On SQLite] The JSON interface is like, “we save the text and when you retrieve it we parse the JSON at several hundred MB/s and let you do path queries against it please stop overthinking it, this is filing cabinet.”

Paul Ford # 29th January 2018, 4:29 pm

How to turn a list of JSON objects into a Datasette. ramadis on GitHub cleaned up data on 184,879 crimes reported in Buenos Aires since 2016 and shared them on GitHub as a JSON file. Here are my notes on how to use Pandas to convert JSON into SQLite and publish it using Datasette. # 20th January 2018, 1:07 am

How to compile and run the SQLite JSON1 extension on OS X. Thanks, Stack Overflow! I’ve been battling this one for a while—it turns out you can download the SQLite source bundle, compile just the json1.c file using gcc and load that extension in Python’s sqlite3 module (or with Datasette’s --load-extension= option) to gain access to the full suite of SQLite JSON functions—json(), json_extract() etc. # 10th January 2018, 9:01 pm


Datasette: instantly create and publish an API for your SQLite databases

I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases.

[... 968 words]

Select Transform: JSON Template over JSON (via) A barrage of interesting ideas here. Having clients transmit up a JSON template which is then executed against data on the server and used to return exactly the data the client needs is just one of them (significant overlap with GraphQL there). # 18th October 2017, 5:12 pm


How does a web page interact with a server to parse a dynamic JSON file?

If you’re only dealing with 60 records there’s no need to add a full database. I’ve actually hand coded a 50 record JSON file before and it was fine- use an editor with good JSON support (I like Sublime Text 2) and it’s pretty easy to hand write.

[... 103 words]


Has JSON pretty much replaced XML for string processing for the web, or are there use cases where XML is still necessary?

It’s replaced XML as the default format for most APIs. XML is still necessary for Atom/RSS feeds and other existing standards built on top of XML. It’s also a better choice than JSON for markup-style data—stuff like XHTML where tags are applied to sequences of characters within larger chunks of text.

[... 81 words]

How can I parse unquoted JSON with JavaScript?

Unquoted JSON isn’t JSON—the JSON spec requires that strings are quoted (with double quotes, not single quotes).

[... 104 words]


What are the JSON security concerns in web development?

Be very careful when implementing JSON-P for authenticated actions—evil third party sites could assemble URLs to your user’s private data and steal it. This attack has worked against Gmail in the past.

[... 203 words]


Indexing JSON in Solr 3.1. The next release of Solr will support indexing documents provided as JSON—Solr currently requires incoming documents to be formatted as XML. # 10th December 2010, 9:46 am