Simon Willison’s Weblog

Analyzing US Election troll tweets with Datasette

6th August 2018

FiveThirtyEight published nearly 3 million tweets from accounts associated with the Russian “Internet Research Agency”, based on research by Darren Linvill and Patrick Warren at Clemson University.

FiveThirtyEight’s tweets were shared as CSV, so I’ve used my csvs-to-sqlite tool to convert them and used Datasette to publish them in a searchable, browsable interface: https://russian-troll-tweets.datasettes.com/
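To give a rough idea of what the CSV-to-SQLite conversion step involves, here is a minimal sketch using only Python's standard library. It is not how csvs-to-sqlite is implemented, just an illustration of the idea. The `load_csv_into_sqlite` helper, the `tweets` table name, and the sample rows are all made up for this example, and the real CSV files have more columns than the three shown here.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row as text."""
    reader = csv.reader(csv_file)
    header = next(reader)  # first row supplies the column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return header


# Hypothetical sample rows standing in for one of the published CSV files.
sample_csv = io.StringIO(
    "author,language,content\n"
    "example_account_1,English,first example tweet\n"
    "example_account_2,Russian,second example tweet\n"
)

conn = sqlite3.connect(":memory:")
columns_loaded = load_csv_into_sqlite(conn, "tweets", sample_csv)
row_count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
print(columns_loaded, row_count)
```

Swapping `":memory:"` for a filename would produce a SQLite database file on disk, which is the kind of file Datasette serves.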

The data is most interesting if you apply faceting. Here’s the full set of tweets faceted by author, language, region, post type and account category:

Faceted search interface showing Russian Troll Tweets
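Conceptually, a facet is a count of rows grouped by the distinct values of a column. The sketch below shows that idea with Python's `sqlite3` module. It is an assumption-laden illustration, not Datasette's actual facet code: the `facet_counts` helper, the `tweets` table, and the rows inserted into it are invented for the example.

```python
import sqlite3


def facet_counts(conn, table, column, limit=10):
    """Return (value, count) pairs for `column`, most common first."""
    sql = (
        f'SELECT "{column}" AS value, COUNT(*) AS n '
        f'FROM "{table}" GROUP BY "{column}" '
        "ORDER BY n DESC, value LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


# Made-up rows; the real dataset has nearly 3 million tweets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (author TEXT, language TEXT)")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)",
    [
        ("example_account_1", "English"),
        ("example_account_1", "English"),
        ("example_account_2", "Russian"),
    ],
)

language_facet = facet_counts(conn, "tweets", "language")
author_facet = facet_counts(conn, "tweets", "author")
print(language_facet)
print(author_facet)
```

Running one such query per column (author, language, region, post type, account category) gives the kind of summary counts a faceted view displays alongside the rows.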

The minimal source code for this Datasette instance is on GitHub.

Posted 6th August 2018 at 3:15 pm

This is Analyzing US Election troll tweets with Datasette by Simon Willison, posted on 6th August 2018.
