Simon Willison’s Weblog

7 items tagged “parsing”

Parsing CSV using ANTLR and Python 3. I’ve been trying to figure out how to use ANTLR grammars from Python—this is the first example I’ve found that really clicked for me. # 6th April 2018, 2:33 pm

Parse shell one-liners with pyparsing. Neat introduction to the pyparsing library, both for parsing tokens into labeled sections and constructing an AST from them. # 22nd October 2017, 1:35 pm

Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide.

Andrew Clover # 16th November 2009, 10:32 am

HTML 5 Parsing. Firefox nightlies include a new parser that implements the HTML5 parsing algorithm (disabled by default), which uses C++ code automatically generated from Henri Sivonen’s Java parser first used in the HTML5 validator. # 11th July 2009, 11:36 pm

South’s Design. Andrew Godwin explains why South resorts to parsing your file in order to construct information about for creating automatic migrations. # 13th May 2009, 12:30 pm

Simple Top-Down Parsing in Python. Eye-opening tutorial on building a recursive descent parser for Python, in Python that uses top-down operator precedence. # 19th July 2008, 11:37 pm

python4ply tutorial. python4ply is a parser for Python written in Python using the PLY toolkit, which compiles to Python bytecode using the built-in compiler module. The tutorial shows how to use it to add support for Perl-style 1_000_000 readable numbers. # 11th March 2008, 5:49 am