Simon Willison’s Weblog


7 posts tagged “lxml”


Introducing Cloudera Desktop. It’s a GUI for Hadoop, and under the hood is a whole stack of open source software, including Python, Django, MooTools, Twisted, lxml, CherryPy, Mako, Java and AspectJ.

# 21st October 2009, 6:48 pm / hadoop, open-source, cloudera, python, django, mootools, twisted, lxml, cherrypy, mako, java, aspectj


How to install lxml python module on mac os 10.5 (leopard). Instructions that work! Finally, I can find out what all the fuss is about.

# 15th December 2008, 12:05 am / lxml, python, osx, leopard, xml, libxml2

lxml: an underappreciated web scraping library. I just wish I could get the wretched thing to install on OS X Leopard without resorting to MacPorts.

# 11th December 2008, 9:54 am / lxml, macports, python, ian-bicking, scraping

pyquery. “A jQuery-like library for Python”—implemented on top of lxml, providing jQuery style methods for manipulating an HTML or XML document.

# 6th December 2008, 9:53 am / jquery, pyquery, python, lxml, xml


lxml.cssselect. lxml includes an implementation of CSS 3 selectors, which compiles them to XPath expressions. Should be a useful tool for parsing Microformats from Python.

# 24th September 2007, 11:57 pm / python, lxml, libxml2, css, selectors, xpath, css3, microformats

Atom Models. Building Python classes that act as utility wrappers around data stored in an lxml DOM object.
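
One way to picture the idea, as a minimal sketch rather than the linked post's actual code: a small class holds a reference to an element and exposes properties that read and write the tree directly, so the XML stays the single source of truth. The `Entry` class and its `title` property are hypothetical names chosen for this example, and the standard library fallback is only there so the sketch runs without lxml.

```python
try:
    from lxml import etree  # the library the post is about
except ImportError:
    import xml.etree.ElementTree as etree  # API-compatible stand-in for this sketch

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace in ElementTree's {uri}tag form


class Entry:
    """Hypothetical wrapper: no copied state, everything lives in the element."""

    def __init__(self, element):
        self.element = element

    @property
    def title(self):
        # Read straight from the tree on every access.
        return self.element.findtext(ATOM + "title")

    @title.setter
    def title(self, value):
        node = self.element.find(ATOM + "title")
        if node is None:
            # Create the child element the first time a title is assigned.
            node = etree.SubElement(self.element, ATOM + "title")
        node.text = value


entry = Entry(etree.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>Old title</title></entry>'
))
entry.title = "New title"
serialized = etree.tostring(entry.element).decode("utf-8")
print(serialized)
```

Because the wrapper never caches values, any other code that edits the same element is reflected the next time a property is read.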

# 7th August 2007, 4:02 pm / lxml, dom, xml, python, ian-bicking, atom