Simon Willison's Weblog: pdfminer

PDFMiner
Simon Willison, 2008-08-03T15:29:40+00:00
https://simonwillison.net/2008/Aug/3/pdfminer/#atom-tag

<p><strong><a href="http://www.unixuser.org/~euske/python/pdfminer/index.html">PDFMiner</a></strong></p> A useful-looking PDF parsing library written in Python. It can produce an XML representation of the text and style information in a PDF document. <p>Tags: <a href="https://simonwillison.net/tags/pdf">pdf</a>, <a href="https://simonwillison.net/tags/pdfminer">pdfminer</a>, <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/screenscraping">screenscraping</a>, <a href="https://simonwillison.net/tags/xml">xml</a></p>