Automate the Boring Stuff with Python: Working with PDF and Word Documents

I stumbled across this chapter while trying to extract some data from a PDF file (the kind with actual text in it, as opposed to dodgy scanned images) and it worked perfectly: PyPDF2.PdfFileReader(open("file.pdf", "rb")).getPage(0).extractText()
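For anything longer than a one-off, the same call chain can be wrapped so that the file handle gets closed afterwards. This is a minimal sketch rather than anything from the book: the function name `extract_page_text` and its `page_number` parameter are made up for illustration, and it assumes a PyPDF2 release that still exposes the `PdfFileReader`, `getPage` and `extractText` names used above (later releases renamed them to `PdfReader`, `.pages[...]` and `extract_text()`).

```python
def extract_page_text(path, page_number=0):
    """Return the text of a single page of a PDF (hypothetical helper)."""
    # Imported inside the function so this snippet can be loaded even on a
    # machine where PyPDF2 is not installed yet.
    import PyPDF2

    # The with-block closes the file once the text has been extracted,
    # which the one-liner above does not do.
    with open(path, "rb") as pdf_file:
        reader = PyPDF2.PdfFileReader(pdf_file)
        # Pages are zero-indexed, so 0 is the first page.
        return reader.getPage(page_number).extractText()
```

Calling `extract_page_text("file.pdf")` should behave like the one-liner. As with the original, it only helps for PDFs that contain real text, not scanned images.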

Posted 6th November 2019 at 4:17 pm

Simon Willison’s Weblog
