Simon Willison’s Weblog

Subscribe

Items tagged mysql in 2009

Filters: Year: 2009 × mysql × Sorted by date


Crowdsourced document analysis and MP expenses

As you may have heard, the UK government released a fresh batch of MP expenses documents a week ago on Thursday. I spent that week working with a small team at Guardian HQ to prepare for the release. Here’s what we built:

[... 2081 words]

Fixing Poor MySQL Default Configuration Values. Some tips from Jeremy Zawodny on configuring MySQL for high traffic environments—he suggests skip-name-resolve, connect_timeout=20, thread_cache_size=not-zero, max_connect_errors=very-high-number, slave_net_timeout=30. # 9th November 2009, 5:07 pm

Django-Jython 1.0.0 released! Now with database backends for PostgreSQL, Oracle and MySQL. The next release (planned for next month) should provide full compatibility with Django 1.1—the current release has 1.1 support for PostgreSQL but only 1.0 support for the other two databases. # 9th November 2009, 1:53 pm

MySQL backups with EBS snapshots. Assaf Arkin’s 45 line ruby script shows how to lock tables / XFS freeze / create an EBS snapshot / unfreeze and unlock, with hourly snapshots preserved for the past 24 hours and daily snapshots for the past week. Is an EBS snapshot enough to restore your data to somewhere other than EC2 though? # 13th October 2009, 12:34 pm

MySQL Connector/Python. A pure Python implementation of the MySQL client/server protocol, meaning you can talk to a MySQL server from Python without needing to first install the MySQL client libraries (which often requires compiling from source). # 2nd October 2009, 2:16 pm

MySQL, Python and MacOS X 10.6 (Snow Leopard). I gave up on compiling things when I upgraded to Snow Leopard—I’m back to running Ubuntu in a VMWare instance, mounted over Samba so I can still use TextMate. # 25th September 2009, 10:14 pm

Ravelry. Tim Bray interviews Casey Forbes, the single engineer behind Ravelry, the knitting community that serves 10 million Rails requests a day using just seven physical servers, MySQL, Sphinx, memcached, nginx, haproxy, passenger and Tokyo Cabinet. # 3rd September 2009, 6:50 pm

How to find un-indexed queries in MySQL, without using the log (via) Use tcpdump(!) to sniff the MySQL protocol and dump out queries that had the “no index used” bit set. # 19th August 2009, 11:42 am

SQL pie chart. Generating ASCII art pie charts using the world’s scariest MySQL SELECT statement. # 13th August 2009, 1:04 pm

Why we migrated from MySQL to MongoDB. Includes some useful information on MongoDB’s limitations—for example, running many different collections can waste disk space and repairing large datasets or bulk deleting many rows can block and lock the database for the duration of the operation. # 27th July 2009, 10:49 am

Installing Django, Solr, Varnish and Supervisord with Buildout. Useful, detailed instructions... but I still think this stuff is Way Too Difficult at the moment. I’m a big fan of the idea of sites that are assembled from multiple smaller web services talking HTTP to each other, but ensuring all the moving parts stay running is massively more painful than just running Apache and MySQL. # 7th June 2009, 1:54 pm

peeping into memcached. “Peep uses ptrace to freeze a running memcached server, dump the internal key metadata, and return the server to a running state”—you can then load the resulting data in to MySQL using LOAD LOCAL INFILE and analyse it using standard SQL queries. # 20th April 2009, 6:35 pm

Sphinx 0.9.9-rc2 is out. Interesting new feature: the Sphinx search server now supports the MySQL binary protocol, so you can talk to it using a regular MySQL client library and fire off search queries using SELECT syntax and the new SphinxQL query language. # 8th April 2009, 1:59 pm

Introducing Digg’s IDDB Infrastructure. IDDB is Digg’s new infrastructure component for sharding data across multiple databases, with support for both MySQL and memcachedb. “The DiggBar and URL minifying service is powered by a 16 machine IDDB cluster, which includes 8 write masters in the index and 8 MySQL storage nodes.” # 3rd April 2009, 8:42 pm

Concurrence. Exciting: a Python framework for “creating massively concurrent network applications” (the tutorial benchmarks a Hello World web server at over 8,000 requests a second). It’s implemented on top of libevent using pyrex, can run on either Stackless Python or Greenlets from the py library and ships with a WSGI server, an HTTP client and a DBAPI 2.0 compliant MySQL driver. # 15th March 2009, 1:28 pm

[Drizzle] won’t be a get-out-of-jail-free card for very write-heavy applications but I bet it will do wonders for heavily replicated, heavily federated, read-heavy architectures (you know, normal stuff).

Richard Crowley # 8th March 2009, 6:05 pm

Database Sharding at Netlog, with MySQL and PHP. Detailed MySQL sharding case study from Netlog, who serve five billion page requests a month using thousands of shards across more than 80 database servers. # 2nd March 2009, 10:22 am

How FriendFeed uses MySQL to store schema-less data. The pain of altering/ adding indexes to tables with 250 million rows was killing their ability to try out new features, so they’ve moved to storing pickled Python objects and manually creating the indexes they need as denormalised two column tables. These can be created and dropped much more easily, and are continually populated by an off-line index building process. # 27th February 2009, 2:33 pm

New Gearman Server & Library in C, MySQL UDFs. Gearman, the job queue written for LiveJournal and now used by Digg and Yahoo!, has been rewritten in C. Looks like a good candidate for an easily configured lightweight message queue. Also includes hooks for writing MySQL functions that can interact with queues. # 13th January 2009, 4:41 pm