Simon Willison’s Weblog

Entries tagged mysql

Filters: Type: entry × mysql ×


What are the key insights in mastering SQL queries?

You may find this article useful (despite the list-o-matic name): 10 Easy Steps to a Complete Understanding of SQL—I’ve been using SQL for years but I found that some of the concepts explained there helped firm up my fundamental understanding of how to use it effectively.

[... 64 words]

How was FriendFeed’s schema less db faster than pure MySQL?

The principle reason they switched to a schemaless DB was to work around the challenges of having to make schemes changes in MySQL, which can lock the table and take hours if bit days to complete in large tables.

[... 115 words]

Is there a maximum number of records one can fetch with a MySQL query?

To my knowledge there is no upper limit—that’s why good database libraries provide abstractions that let you iterate over large queries without loading the entire result set in to memory at once.

[... 54 words]

Should I store markdown instead of HTML into database fields?

You should store the exact format that was entered by the user.

[... 95 words]

How can I detect manual record insert from mysql cansole into my code in django .?

You can’t. The best you can do is have Django periodically poll MySQL to see if anything has changed (maybe with a custom management command run by cron)—having a TIMESTAMP field on every table which will be automatically set to the current time when a record is inserted will help you spot things that have changed.

[... 80 words]

What tools and techniques are used for relational database version control (structure and data)?

The term you are looking for is database migrations (sometimes called database change scripts).

[... 308 words]

Any source available to download sample data (in 10+ GB) for testing?

Wikipedia has some pretty interesting dumps, in both XML and SQL format: http://meta.wikimedia.org/wiki/I...

[... 100 words]

What is the most efficient way to lookup an object (e.g. a user) by only a string?

Yes—an index on a varchar column is exactly how you would implement this.

[... 38 words]

Python Django load MySQL database from csv files performance issue?

Don’t use the Django ORM for bulk imports—the performance overhead is pretty small for regular web page stuff, but it adds up if you are running millions of inserts.

[... 63 words]

How can you build a search engine for a website built in PHP/MySQL?

There are a bunch of options.

[... 310 words]

What are XML feed best practices?

It sounds like you’re pretty much screwed already, if you’re dealing with companies that still think FTPing XML around is a sensible thing to do.

[... 364 words]

Has anyone implemented a message queue with mysql and many workers?

Flickr built their own message queue on top of MySQL: http://code.flickr.com/blog/2008...

[... 29 words]

Is a relational database with many-to-many relationships difficult to develop into a web app?

Many to Many tables can be a bit of a pain to deal with using regular SQL, but a good ORM can abstract away any potential complexity almost entirely. I find using the Django ORM means I’m much less likely to shy away from a design that involves a many-to-many relationship because I know it won’t increase the complexity of the application. I imagine the Rails ORM has the same effect.

[... 91 words]

What are the pros and cons of switching from MySQL to one of the NoSQL databases?

Pro: If your own benchmarks tell you you need to switch to a specific NoSQL solution, you’ll know exactly what the pro is.

[... 227 words]

When should one switch from MySQL to Oracle or PostgreSQL?

When your own benchmarks prove that your application’s particular load characteristics will perform better on another database—and the difference is large enough that it’s worth the cost involved in retargeting your code. If that cost is high (and it probably will be) it may be worth paying for some expert consultants to ensure that your implementations against the different databases are properly optimised.

[... 102 words]

Crowdsourced document analysis and MP expenses

As you may have heard, the UK government released a fresh batch of MP expenses documents a week ago on Thursday. I spent that week working with a small team at Guardian HQ to prepare for the release. Here’s what we built:

[... 2051 words]

Getting from point A to B (the right way)

If your laptop is relatively recent it might have hardware support for virtualization (Intel Core Duo chips do, for example). If so, it’s worth looking in to using VMWare or Parallels to run a virtual linux server locally on your machine. You’ll need a fair amount of RAM for this as well—2 GB minimum probably.

[... 194 words]