Simon Willison’s Weblog

10 items tagged “pandas”

How to rewrite your SQL queries in Pandas, and more (via) I still haven’t fully internalized the idioms needed to manipulate DataFrames in pandas. This tutorial helps a great deal—it shows the Pandas equivalents for a host of common SQL queries. # 19th April 2018, 6:34 pm

Analyzing my Twitter followers with Datasette

I decided to do some ad-hoc analsis of my social network on Twitter this afternoon… and since everything is more fun if you bundle it up into a SQLite database and publish it to the internet I performed the analysis using Datasette.

[... 1314 words]

How to turn a list of JSON objects into a Datasette. ramadis on GitHub cleaned up data on 184,879 crimes reported in Buenos Aires since 2016 and shared them on GitHub as a JSON file. Here are my notes on how to use Pandas to convert JSON into SQLite and publish it using Datasette. # 20th January 2018, 1:07 am

Big Data Workflow with Pandas and SQLite (via) Handy tutorial on dealing with larger data (in this case a 3.9GB CSV file) by incrementally loading it into pandas and writing it out to SQLite. # 28th November 2017, 11:02 pm

Exploring Line Lengths in Python Packages. Interesting exploration of the impact if the 79 character length limit rule of thumb on various Python packages—and a thoroughly useful guide to histogram plotting in Jupyter, pandas and matplotlib. # 10th November 2017, 3:34 pm

A Minimalist Guide to SQLite. Pretty comprehensive actually—covers the sqlite3 command line app, importing CSVs, integrating with Python, Pandas and Jupyter notebooks, visualization and more. # 2nd November 2017, 1:23 am

Exploring United States Policing Data Using Python. Outstanding introduction to data analysis with Jupyter and Pandas. # 29th October 2017, 4:58 pm

Streaming Dataframes. This is some deep and brilliant magic: Matthew Rocklin’s Streamz Python library provides some elegant abstractions for consuming infinite streams of data and calculating cumulative averages and rolling reductions... and now he’s added an integration with jupyter that lets you embed bokeh graphs and pandas dataframe tables that continue to update in realtime as the stream continues! Check out the animated screenshots, this really is a phenomenal piece of work. # 19th October 2017, 2:25 pm

Generating interactive HTML charts from Python?

D3 is absolutely amazing but the learning curve is a bit steep. Totally worth the effort to learn it in the long run, but it’s not so useful if you want to get something done quickly.

[... 97 words]

Panda Tuesday; The History of the Panda, New APIs, Explore and You. Flickr’s Rainbow Vomiting Panda of Awesomeness now has a family of associated APIs. # 4th March 2009, 11:49 am