Simon Willison’s Weblog


Items tagged scraping, semanticweb in Mar, 2023



I expect GPT-4 will have a LOT of applications in web scraping

The increased 32,000 token limit will be large enough to send it the full DOM of most pages, serialized to HTML—then ask questions to extract data

Or... take a screenshot and use the GPT-4 image input mode to ask questions about the visually rendered page instead!

Might need to dust off all of those old semantic web dreams, because the world’s information is rapidly becoming fully machine readable

Me # 16th March 2023, 1:09 am
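
To make the first idea concrete, here is a rough Python sketch of the "serialize the page to HTML, check it fits, then ask a question" step. Everything in it is an assumption for illustration: the helper names (`slim_html`, `estimate_tokens`, `build_prompt`) are hypothetical, the four-characters-per-token figure is a crude rule of thumb rather than a real tokenizer, and dropping `<script>` and `<style>` content is just one plausible way to save space. It only builds the prompt text; actually sending it to a model is left out.

```python
from html import escape
from html.parser import HTMLParser

TOKEN_LIMIT = 32_000   # the context size mentioned in the post
CHARS_PER_TOKEN = 4    # ASSUMPTION: crude heuristic, not a real tokenizer


class _Slimmer(HTMLParser):
    """Re-serialize HTML while dropping <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif not self._skip_depth:
            # Keep the start tag exactly as it appeared in the source.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are kept as-is.
        if not self._skip_depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif not self._skip_depth:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            # The parser unescapes character references, so re-escape text.
            self.parts.append(escape(data, quote=False))


def slim_html(html_text):
    """Return the HTML with script and style blocks removed."""
    parser = _Slimmer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


def estimate_tokens(text):
    """Very rough token estimate (ceiling of characters / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def build_prompt(html_text, question, limit=TOKEN_LIMIT):
    """Combine a question with the slimmed page, or fail if it looks too big."""
    prompt = f"{question}\n\nHTML:\n{slim_html(html_text)}"
    if estimate_tokens(prompt) > limit:
        raise ValueError("page is probably too large for the context window")
    return prompt
```

Calling `build_prompt(page_html, "List every product name and price as JSON")` would give you a single string to hand to whichever model client you use. A real tokenizer would give a tighter size check than the character heuristic here.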
