Simon Willison’s Weblog


September 2010

Sept. 1, 2010

Setting up Munin on Ubuntu. Useful guide to setting up my favourite graphing/monitoring tool for personal projects.

# 2:05 pm / ops, sysadmin, ubuntu, recovered, munin

Sept. 2, 2010

Who are major competitors to Solr?

ElasticSearch is a really interesting one—it’s the same underlying search library (Lucene) and the same integration model (an HTTP interface) but takes quite a different approach. It hasn’t been around for a long time but it looks very impressive: http://www.elasticsearch.com/

[... 95 words]

Sept. 3, 2010

The Seven Secrets of Successful Data Scientists. Some sensible advice, including: pick the right-sized tool, compress everything, split up your data, use open source and run the analysis where the data is.

# 12:36 am / data, big-data, recovered

Vox is closing on September 30, 2010. One month seems like very short notice for closing a service of this size, especially since it functions as an OpenID provider: in addition to migrating their content away, users may need to sign in to other services and set up an alternative form of authentication. UPDATE: From the comments, Vox accounts that migrate to TypePad will also have their OpenID migrated, and TypePad will continue to serve OpenID requests for old vox.com addresses. Smart solution.

# 8:50 am / openid, sixapart, vox, recovered, closing

Sept. 5, 2010

ZeroMQ: Modern and Fast Networking Stack. I get ZeroMQ now. I was having trouble figuring out how it differed from things like RabbitMQ—it turns out it’s an entirely new low-level socket abstraction, designed to make common socket programming tasks like message sending/receiving and publish/subscribe a whole lot easier than dealing with raw BSD sockets.

# 7:41 pm / io, messaging, networking, sockets, zeromq, recovered

Sept. 8, 2010

Why do so many Internet sites end with the letter ’r’ (but not ’er’)? Think about Tumblr, Dopplr, Migratr. What’s behind this?

We just launched a project called lanyrd, which is a play on lanyard. We partly picked the name because the domain was available, but there’s actually a big advantage to using a made-up word: it’s really easy to search for coverage and feedback on Twitter, Google Blogsearch and the like. The string “lanyrd” is almost exclusively used to discuss our project—had we used a dictionary word, tracking down feedback would have been a lot harder.

[... 105 words]

Sept. 11, 2010

Welcome to Lanyrd | The Lanyrd Blog. We’ve started a blog for Lanyrd, our social conference directory project. We’re off to a great start: “Lanyrd is now listing 1,508 conferences and 5,167 individual speaker profiles. 5,637 people have signed in to the site and made 13,293 edits to our data.”

# 9:32 pm / blogging, conferences, projects, lanyrd, recovered

Sept. 16, 2010

While I don’t expect Twitter to master its own destiny as far as the decentralization of the medium goes, I do support the idea, and I hope that Twitter as a business can coexist with the need for the world to have a free, open, reliable, and verifiable way for humans to instantly communicate in a one-to-many fashion.

Alex Payne

# 11:07 am / alex-payne, decentralisation, twitter, recovered

Sept. 22, 2010

Creating Shazam in Java. Using a Fast Fourier Transform.
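
The linked post's Java code is not reproduced here. As a rough illustration of the underlying idea, the sketch below is a plain radix-2 Cooley-Tukey FFT in Python that finds the strongest frequency bin in a block of samples, which is one plausible first step towards an audio fingerprint. The function names `fft` and `dominant_bin` are made up for this example.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT. Input length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of even-indexed samples
    odd = fft(samples[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_bin(samples):
    """Index of the strongest non-DC frequency bin in the lower half of the spectrum."""
    spectrum = fft(samples)
    half = len(samples) // 2
    return max(range(1, half), key=lambda k: abs(spectrum[k]))


# A pure tone that completes 5 cycles over 64 samples peaks in bin 5.
tone = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
print(dominant_bin(tone))  # 5
```

A real fingerprinting system would run this over many short, overlapping windows of audio and keep the peaks from each window; a production implementation would also use an optimised FFT library rather than a recursive one.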

# 9:39 pm / algorithms, java, recovered, shazam

A Gentle Introduction to CouchDB for Relational Practitioners. By “High Performance MySQL” author Baron Schwartz—a smart, concise overview that touches pretty much everything that’s interesting about CouchDB.

# 9:51 pm / couchdb, databases, recovered

ijson. A SAX-style streaming JSON parser for Python, using ctypes to talk to the yajl C library.

# 9:59 pm / ctypes, json, python, sax, recovered

Sept. 23, 2010

I pushed 20 more of my projects to GitHub. Some great Node.js stuff here from Peteris Krumins, including modules for processing PNG, JPEG and animated GIFs.

# 1:18 am / images, node, png, recovered, jpeg, peteris-krumins

Google Chrome Frame: Stable and Speedy (via) “Today, we’re very happy to take the Beta tag off of Google Chrome Frame and promote it to the Stable channel.”—MSI installer included, for IT administrators to easily deploy Chrome Frame to multiple machines.

# 1:34 am / chrome, chromeframe, recovered

evercookie—virtually irrevocable persistent cookies (via) Mischievous genius from the chap who created the MySpace worm—evercookie attempts to set an irrevocable cookie using a whole bunch of different methods, including “storing cookies in RGB values of auto-generated, force-cached PNGs using HTML5 Canvas tag to read pixels back out” and an extremely clever scheme built on top of the web history CSS visited link colour vulnerability.

# 1:35 am / recovered

Sept. 27, 2010

Why are tech conferences so expensive to attend?

Large conferences with big name speakers are expensive to organise. They are also priced to what the market will bear.

[... 103 words]

Will Redis support per-database persistence configuration?

I don’t know if that’s on the roadmap (you’d need to ask antirez on the mailing list or Twitter), but it should be easy enough to run multiple Redis instances with different settings—especially on a multi core machine.

[... 52 words]
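
The "multiple instances with different settings" approach mentioned in the Redis answer above can be sketched as follows. This is an illustration only: the instance names, ports and directories are hypothetical, and the script simply renders one config file per instance using the standard `port`, `dir`, `dbfilename`, `save` and `appendonly` directives.

```python
def render_redis_config(port, data_dir, save_rules, appendonly=False):
    """Render a minimal redis.conf for one instance.

    save_rules is a list of (seconds, changes) pairs, one `save` line each.
    """
    lines = [
        f"port {port}",
        f"dir {data_dir}",
        f"dbfilename dump-{port}.rdb",
        f"appendonly {'yes' if appendonly else 'no'}",
    ]
    for seconds, changes in save_rules:
        lines.append(f"save {seconds} {changes}")
    return "\n".join(lines) + "\n"


# Hypothetical instances: one snapshots rarely, one persists aggressively.
INSTANCES = {
    "cache": dict(port=6379, data_dir="/var/lib/redis/cache",
                  save_rules=[(3600, 1)]),
    "durable": dict(port=6380, data_dir="/var/lib/redis/durable",
                    save_rules=[(60, 1)], appendonly=True),
}

configs = {name: render_redis_config(**opts) for name, opts in INSTANCES.items()}

for name, text in configs.items():
    print(f"# {name}.conf")
    print(text)
```

Each rendered file would then be passed to its own `redis-server` process, so every "database" gets its own port and its own persistence policy.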

What new tools and technologies you learnt recently are worth it?

Redis: http://simonwillison.net/2009/Oc... and http://simonwillison.net/static/...

[... 144 words]

The Web for me is still URLs and HTML. I don’t want a Web which can only be understood by running a JavaScript interpreter against it.

Me, on Twitter

# 4:37 pm / html, javascript, urls, recovered

Why are XSS attacks spreading like fire these days?

XSS attacks are common and easy, and crop up all the time. What’s new is that the number of people who are aware of the potential for XSS worms has increased hugely, so when an XSS hole does crop up in something popular there’s a much higher chance of someone turning it into a worm (as happened with Twitter the other day).

[... 96 words]

What is the best conference for Web Designers in Australia to Attend?

I’ve not been, but I’ve always heard great things about Web Directions South. I attended and spoke at @media in London this year, which is run by the same team, and it was excellent.

[... 71 words]

Sept. 30, 2010

Content management remains an unsolved problem. Untold billions of dollars (and hours) have been spent building commercial, open source, and custom content management systems since the first Web page was pushed to a Web server using FTP, and yet they all still suck.

Rafe Colburn

# 12:26 pm / contentmanagement, rafecolburn, recovered

Velocity: Forcing Gzip Compression. Almost every browser supports gzip these days, but 15% of web requests have had their Accept-Encoding header stripped or mangled, generally due to poorly implemented proxies or anti-virus software. Steve Souders passes on a trick used by Google Search, where an iframe is used to test the browser’s gzip support and set a cookie to force gzipping of future pages.
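
The server-side half of the trick described above might look something like the sketch below. It assumes the iframe test has already run in the browser and set a cookie on success. The cookie name `gzip_ok` and the function names are invented for this example, and the header parsing is deliberately simplified (q-values are ignored).

```python
import gzip
from http.cookies import SimpleCookie

# Hypothetical cookie name, set by the iframe test when the browser
# successfully decoded a gzipped response.
FORCE_GZIP_COOKIE = "gzip_ok"


def client_accepts_gzip(headers):
    """True if the request advertises gzip, or carries the force-gzip cookie."""
    accept = headers.get("Accept-Encoding", "")
    # Simplification: strip any ";q=..." parameter rather than honouring it.
    encodings = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "gzip" in encodings:
        return True
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    morsel = cookie.get(FORCE_GZIP_COOKIE)
    return morsel is not None and morsel.value == "1"


def encode_body(body, headers):
    """Return (payload, extra_response_headers) for a bytes response body."""
    if client_accepts_gzip(headers):
        extra = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), extra
    return body, {}
```

With this in place, a request whose `Accept-Encoding` header was stripped by a proxy still gets a compressed response as long as the cookie survived, for example `encode_body(page, {"Cookie": "gzip_ok=1"})`.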

# 5:45 pm / browsers, gzip, performance, proxies, steve-souders, recovered
