Simon Willison’s Weblog


Blogmarks tagged shotscraper


shot-scraper 1.4. I decided to add HTTP Basic authentication support to shot-scraper today and found several excellent pull requests waiting to be merged, by Niel Thiart and mhalle.

1.4 adds support for HTTP Basic auth, custom --scale-factor shots, additional --browser-arg arguments and a fix for --interactive mode. # 5th February 2024, 11:11 pm
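As a rough sketch of how the 1.4 options fit together, here is a small Python helper that assembles a shot-scraper command line. The helper name `build_shot_command` is made up for illustration, and the `--auth-username` / `--auth-password` flag names are an assumption based on the 1.4 release notes, so check `shot-scraper --help` before relying on them.

```python
import shlex


def build_shot_command(url, output, username=None, password=None,
                       scale_factor=None, browser_args=()):
    """Build (but do not run) a shot-scraper command as an argv list.

    Hypothetical helper: flag names are assumed from the 1.4 release notes.
    """
    cmd = ["shot-scraper", url, "-o", output]
    # HTTP Basic auth only makes sense when both halves are supplied
    if username is not None and password is not None:
        cmd += ["--auth-username", username, "--auth-password", password]
    # Custom scale factor for the screenshot
    if scale_factor is not None:
        cmd += ["--scale-factor", str(scale_factor)]
    # Each extra browser argument is passed through its own --browser-arg
    for arg in browser_args:
        cmd += ["--browser-arg", arg]
    return cmd


if __name__ == "__main__":
    cmd = build_shot_command(
        "https://example.com/",  # placeholder URL
        "example.png",
        username="demo",         # placeholder credentials
        password="secret",
        scale_factor=3,
    )
    # Print a shell-quoted version you could paste into a terminal
    print(shlex.join(cmd))
```

Returning a list rather than a string means the result can be handed straight to `subprocess.run()` without any shell quoting concerns.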

Migrating out of PostHaven. Amjith Ramanujam decided to migrate his blog content from PostHaven to a Markdown static site. He used shot-scraper (shelled out to from a Python script) to scrape his existing content using a snippet of JavaScript, wrote the content to a SQLite database using sqlite-utils, then used markdownify (new to me, a neat Python package for converting HTML to Markdown via BeautifulSoup) to write the content to disk as Markdown. # 24th May 2023, 7:38 pm
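The general shape of that pipeline can be sketched in Python. This is not Amjith's actual script: the JavaScript snippet, the `article` selector, the `posts` table name and the output filenames are all placeholders invented for illustration. It assumes `shot-scraper`, `sqlite-utils` and `markdownify` are installed.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical extraction snippet: the selectors are placeholders, not PostHaven's markup
EXTRACT_JS = (
    "Array.from(document.querySelectorAll('article')).map(el => ({"
    " title: (el.querySelector('h1, h2') || {}).innerText || '',"
    " html: el.innerHTML }))"
)


def scrape_posts(url, run=subprocess.run):
    """Shell out to `shot-scraper javascript` and parse the JSON it prints."""
    result = run(
        ["shot-scraper", "javascript", url, EXTRACT_JS],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def save_as_markdown(posts, db_path, out_dir):
    """Store the raw posts in SQLite, then write each one out as Markdown."""
    # Imported here so scrape_posts() works without these packages installed
    import sqlite_utils
    from markdownify import markdownify

    db = sqlite_utils.Database(db_path)
    db["posts"].insert_all(posts)  # table name is an assumption

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, post in enumerate(posts):
        body = markdownify(post["html"])  # HTML to Markdown
        (out / f"post-{i:03d}.md").write_text(
            f"# {post['title']}\n\n{body}\n", encoding="utf-8"
        )
```

The `run` parameter exists only so the scraping step can be exercised with a stand-in function, without launching a browser.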

Examples of sites built using Datasette (via) I gave the examples page on the Datasette website a significant upgrade today: it now includes screenshots (taken using shot-scraper) of six projects chosen to illustrate the variety of problems Datasette can be used to tackle. # 29th January 2023, 3:40 am

Leveraging ‘shot-scraper’ and creating image diffs. Üllar Seerme has a neat recipe for using shot-scraper and ImageMagick to create differential animations showing how a scraped web page has visually changed. # 24th October 2022, 9:34 pm

Dumping the HTML of a page using shot-scraper. New in 1.0 is the “shot-scraper html URL” command, which outputs the HTML of a page once JavaScript has finished executing there. You can pass in additional custom JavaScript to run before the snapshot is taken, and you can also specify a CSS selector on the page to return just that fragment of HTML. # 15th October 2022, 9:30 pm
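Here is a minimal sketch of calling that command from Python. The wrapper name `page_html` is hypothetical. The `-s` (selector) and `-j` (JavaScript) options are what I understand the `shot-scraper html` command to accept, so confirm them with `shot-scraper html --help`.

```python
import subprocess


def page_html(url, selector=None, javascript=None, run=subprocess.run):
    """Return the rendered HTML for a URL via `shot-scraper html`.

    Hypothetical wrapper; requires shot-scraper to be installed and on PATH.
    """
    cmd = ["shot-scraper", "html", url]
    if javascript:
        # Custom JavaScript to execute before the HTML snapshot is taken
        cmd += ["-j", javascript]
    if selector:
        # Return only the fragment matching this CSS selector
        cmd += ["-s", selector]
    result = run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder URL and selector
    print(page_html("https://example.com/", selector="h1"))
```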

shot-scraper 1.0 (via) Only a minor release in terms of features, but I decided that I’m comfortable enough with the CLI design at this point that I’m ready to stamp a 1.0 on it and commit to not making backwards-incompatible changes (at least without shipping a 2.0 release, which I’d like to avoid if possible). # 15th October 2022, 9:28 pm

simonw/datasette-screenshots (via) I started a new GitHub repository to automate taking screenshots of Datasette for marketing purposes, using my shot-scraper browser automation tool. # 17th May 2022, 5:56 pm

@newshomepages (via) Ben Welsh used my shot-scraper tool and GitHub Actions to launch a Twitter bot which tweets screenshots of newspaper homepages on a scheduled basis. Ben says: “The tech is so easy, I was able to pull it off in a couple hours at zero cost. A decade ago I ran a similar project using the cloud resources of the day. [...] It cost thousands of dollars and the screenshots were of much lower quality. Incredible progress!” # 12th March 2022, 7:21 pm