PDFMiner. Useful looking PDF parsing library in Python—can produce an XML representation of the text and style information in a PDF document.
Recent articles
- A selfish personal argument for releasing code as Open Source - 24th January 2025
- Anthropic's new Citations API - 24th January 2025
- Six short video demos of LLM and Datasette projects - 22nd January 2025