Simon Willison’s Weblog

Items tagged xml in Aug

minixsv (via) As far as I can tell, this is the only library that can validate XML using pure Python (no C extension required). I’d be extremely happy if someone would write a pure Python library (or one that only depends on ElementTree, which is included in the standard library) for validating XML against a Relax NG Compact syntax schema. Even DTD validation would be better than nothing! # 12th August 2009, 4:59 pm

cascadenik: cascading sheets of style for mapnik. Great idea. Mapnik (the open source tile rendering system used by OpenStreetMap and others) has a complex style configuration based on XML. Michal Migurski has built a CSS-style equivalent which compiles down to XML, hopefully making it much quicker and easier to get started with Mapnik customisation. # 30th August 2008, 10:04 am

Tip: Configure SAX parsers for secure processing. Explains the billion laughs attack, among others. # 23rd August 2008, 11:12 am
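
For Python readers, here is a rough sketch of what "secure processing" might look like with the standard library's `xml.sax`. It is not taken from the linked tip: the `TextCollector` handler and the `parse_without_external_entities` function are hypothetical names made up for illustration. It only switches off external general entities; limiting internal entity expansion (the billion laughs case) is a separate concern that this sketch does not address.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Hypothetical handler that just accumulates character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_without_external_entities(xml_bytes):
    """Parse a document while refusing to resolve external entities."""
    parser = xml.sax.make_parser()
    # Do not load external general entities (local files or URLs).
    parser.setFeature(feature_external_ges, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)
```

Recent Python 3 releases already ship with this feature turned off, so setting it explicitly mostly serves as documentation and as protection on older interpreters.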

DoS vulnerability in REXML. Ruby’s REXML library is susceptible to the “billion laughs” denial of service attack where recursively nested entities expand a single entity reference to a billion characters (kind of like the exploding zip file attack). Rails applications that process user-supplied XML should apply the monkey-patch ASAP; a proper gem update is forthcoming. # 23rd August 2008, 11:11 am
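
To make the arithmetic concrete, here is a small Python sketch (not REXML code) that builds a nested-entity document and computes how large it would become if fully expanded, without ever expanding it. The function names, the entity names `e0`, `e1`, ... and the `lol` seed are illustrative assumptions; the nine-levels-of-ten shape is the commonly cited form of the attack.

```python
def build_nested_entity_doc(levels, fanout, seed="lol"):
    """Return an XML document whose root references a deeply nested entity."""
    decls = ['<!ENTITY e0 "{}">'.format(seed)]
    for i in range(1, levels + 1):
        # Each entity refers to the previous one `fanout` times.
        refs = "&e{};".format(i - 1) * fanout
        decls.append('<!ENTITY e{} "{}">'.format(i, refs))
    dtd = "\n  ".join(decls)
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE root [\n  " + dtd + "\n]>\n"
        "<root>&e{};</root>".format(levels)
    )


def expanded_length(levels, fanout, seed="lol"):
    """Characters produced if the top-level entity were fully expanded."""
    return len(seed) * fanout ** levels
```

With `levels=9` and `fanout=10` the document itself stays well under a kilobyte, while `expanded_length(9, 10)` comes to 3,000,000,000 characters: a billion copies of a three-letter seed. That ratio is the whole attack.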

My Universal Feed Parser was conceived as a weapon against what I considered the gravest error of XML: draconian error handling. Recently, someone asked me to implement a switch that makes it not fall back on lax parsing in the case of an XML wellformedness error. I said no, not because it would be difficult to implement, but because that defeats its entire reason for being.

Mark Pilgrim # 5th August 2008, 10:52 pm

PDFMiner. Useful looking PDF parsing library in Python—can produce an XML representation of the text and style information in a PDF document. # 3rd August 2008, 3:29 pm

Atom Models. Building Python classes that act as utility wrappers around data stored in an lxml DOM object. # 7th August 2007, 4:02 pm
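
A minimal sketch of the general pattern, assuming the standard library's ElementTree in place of lxml so that it runs without extra dependencies. The `AtomFeed` and `AtomEntry` classes here are hypothetical and are not the ones from the linked article; they only show the idea of a thin class whose properties read and write the underlying tree.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


class AtomEntry:
    """Thin wrapper: the element stays the single source of truth."""

    def __init__(self, element):
        self._element = element

    @property
    def title(self):
        return self._element.findtext(ATOM_NS + "title")

    @title.setter
    def title(self, value):
        node = self._element.find(ATOM_NS + "title")
        if node is None:
            # Create the child on demand so assignment always works.
            node = ET.SubElement(self._element, ATOM_NS + "title")
        node.text = value


class AtomFeed:
    def __init__(self, element):
        self._element = element

    @classmethod
    def from_string(cls, text):
        return cls(ET.fromstring(text))

    @property
    def entries(self):
        return [AtomEntry(e) for e in self._element.findall(ATOM_NS + "entry")]
```

Because the wrappers hold no state of their own, a change made through one `AtomEntry` is visible to any other wrapper around the same element, and the tree can be serialised at any point.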

Parsing XML can open network sockets (via) Yikes. Something to bear in mind. # 18th August 2006, 2:27 pm
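
One way to see what a document would ask a parser to fetch, sketched with Python's `xml.parsers.expat`: install a handler that records each external reference and tells expat to continue without loading anything. The `external_references` function and the `example.invalid` URLs are made up for illustration and do not come from the linked post.

```python
from xml.parsers import expat


def external_references(xml_text):
    """List the system identifiers a document asks the parser to fetch."""
    requested = []

    def record(context, base, system_id, public_id):
        requested.append(system_id)
        return 1  # non-zero: carry on parsing, nothing is loaded

    parser = expat.ParserCreate()
    # Ask expat to report an external DTD subset too, not just entities.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE)
    parser.ExternalEntityRefHandler = record
    parser.Parse(xml_text, True)
    return requested
```

Running this over untrusted input before handing it to a more permissive parser gives a quick inventory of the URLs and files it mentions; an empty list means the document made no external requests through these two routes.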