Simon Willison’s Weblog


Monday, 1st August 2022

storysniffer (via) Ben Welsh built a small Python library that guesses if a URL points to an article on a news website, or if it’s more likely to be a category page or /about page or similar. I really like this as an example of what you can do with a tiny machine learning model: the model is bundled as a ~3MB pickle file as part of the package, and the repository includes the Jupyter notebook that was used to train it.
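To make the "tiny model in a pickle file" idea concrete, here is a minimal sketch using only the standard library. It is not storysniffer's actual code or API: the feature names, the weights, the threshold and the `looks_like_article` helper are all hypothetical, and a real model would learn its parameters from labelled URLs rather than have them typed in by hand.

```python
import pickle
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Turn a URL into a few hand-picked numeric features (illustrative only)."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    slug = segments[-1] if segments else ""
    return {
        "depth": len(segments),                # number of path segments
        "hyphens_in_slug": slug.count("-"),    # headline-style slugs have many
        "has_date": 1 if re.search(r"/\d{4}/\d{1,2}/", path) else 0,
    }


def looks_like_article(params: dict, url: str) -> bool:
    """Score the URL with a weighted sum and compare it to a threshold."""
    feats = url_features(url)
    score = sum(params["weights"][name] * value for name, value in feats.items())
    return score >= params["threshold"]


# Made-up parameters standing in for a trained model (assumption, not real data).
params = {
    "weights": {"depth": 0.5, "hyphens_in_slug": 1.0, "has_date": 2.0},
    "threshold": 4.0,
}

# Serialize the parameters to bytes, as a package might ship them in a
# .pickle file, then load them back as the library would before predicting.
blob = pickle.dumps(params)
loaded = pickle.loads(blob)

article = "https://example.com/news/2022/08/01/city-council-approves-new-budget/"
about = "https://example.com/about/"
print(looks_like_article(loaded, article))
print(looks_like_article(loaded, about))
```

The appeal of the pattern is that the shipped artifact is just data: the training happens once in a notebook, and users of the package only pay the cost of loading a small file.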

# 11:40 pm / machine-learning, ben-welsh, python, jupyter
