Simon Willison’s Weblog

7 items tagged “regex”

2019

Weeknotes: ONA19, twitter-to-sqlite, datasette-rure

I’ve decided to start writing weeknotes for the duration of my JSK fellowship. Here goes!

[... 919 words]

2010

Introduction to Surlex. A neat drop-in alternative for Django’s regular expression based URL parsing, providing simpler syntax for common path patterns. # 11th April 2010, 7:23 pm

RE2: a principled approach to regular expression matching. Google have open sourced RE2, the C++ regular expression library they developed for Google Code Search, Sawzall, Bigtable and other internal projects. Unlike PCRE it avoids the potential for exponential run time and unbounded stack usage and guarantees that searches complete in linear time, mainly by dropping support for back references. # 12th March 2010, 9:28 am
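To make the trade-off concrete, here is a minimal sketch of the feature RE2 gives up, written against Python's standard re module (a backtracking engine that does support back references). The pattern, the function name find_doubled_words and the sample sentence are illustrative assumptions, not anything taken from RE2 itself.

```python
import re

# \1 must repeat exactly the text captured by group 1. Matching it means
# remembering what was matched earlier, which is the kind of feature a
# linear-time engine such as RE2 drops.
DOUBLED_WORD = re.compile(r"\b(\w+) \1\b")


def find_doubled_words(text):
    """Return each word that is immediately repeated in text."""
    return [match.group(1) for match in DOUBLED_WORD.finditer(text)]


print(find_doubled_words("this is is a test test"))
```

Patterns like this one need a backtracking engine; anything that avoids back references stays within what a linear-time matcher can handle.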

2009

Request Routing With URI Templates in Node.JS. I quite like this approach (though the implementation is a bit “this” heavy for my taste). JavaScript has no equivalent to Python’s raw strings, so regular expression based routing à la Django ends up being a bit uglier in JavaScript. URI template syntax is more appealing. # 24th November 2009, 9:06 am
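For anyone who hasn't met raw strings, this rough sketch shows why they matter for regex-based routing. The articles route and the match_route helper are hypothetical, made up for illustration; they are not from Django or from the linked Node.JS project.

```python
import re

# Without a raw string every backslash has to be doubled...
escaped = "^articles/(\\d{4})/(\\d{2})/$"
# ...while a raw string lets the pattern be written as the regex engine sees it.
raw = r"^articles/(\d{4})/(\d{2})/$"

# Both literals produce exactly the same pattern text.
assert escaped == raw

ARTICLE_ROUTE = re.compile(raw)


def match_route(path):
    """Return the captured (year, month) strings, or None if the path does not match."""
    match = ARTICLE_ROUTE.match(path)
    return match.groups() if match else None


print(match_route("articles/2009/11/"))
```

The doubled-backslash form is roughly what a pattern written as an ordinary string literal ends up looking like.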

Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide.

Andrew Clover # 16th November 2009, 10:32 am

2008

Python gems of my own (via) Did you know you can pass 128 as a flag to Python’s re.compile() function to spit out a parse tree? I didn’t. re.compile("pattern", 128) # 3rd November 2008, 11:59 am
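The magic number 128 is the value of the re.DEBUG flag, so the named constant works just as well. A small sketch follows; the debug_dump helper and the sample pattern are my own illustrative assumptions. It captures the dump in a string, since re.compile() prints it to standard output rather than returning it.

```python
import contextlib
import io
import re


def debug_dump(pattern):
    """Return the text that re.compile() prints when given the debug flag."""
    buffer = io.StringIO()
    # The parse tree is written with print(), so redirect stdout to capture it.
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, 128)
    return buffer.getvalue()


# re.DEBUG is the named form of the same flag.
print(re.DEBUG == 128)
print(debug_dump("ab+"))
```

The exact layout of the dump differs between Python versions, so treat it as a debugging aid rather than something to parse.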

2007

Primality regex. A regular expression that can identify prime numbers. Unsurprisingly, this one comes from the Perl community. # 18th March 2007, 1:17 am
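The entry doesn't quote the pattern, so this sketch assumes the widely circulated form of the trick, translated to Python: write the number in unary as a string of 1s, then use a back reference to test whether that string splits into two or more equal groups of at least two 1s. The names NOT_PRIME and is_prime are illustrative.

```python
import re

# Matches unary strings whose length is NOT prime:
#   ^1?$          handles 0 and 1
#   ^(11+?)\1+$   a group of two or more 1s repeated at least twice,
#                 i.e. the length has a divisor other than 1 and itself
NOT_PRIME = re.compile(r"^1?$|^(11+?)\1+$")


def is_prime(n):
    """Return True if the non-negative integer n is prime, by regex on unary."""
    return NOT_PRIME.match("1" * n) is None


print([n for n in range(20) if is_prime(n)])
```

This is a curiosity, not a practical primality test: the engine backtracks through candidate divisors, so it slows down quickly as n grows.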