24th October 2025
Research
Blog Tag Prediction with Scikit-Learn
— Automatically assigning meaningful tags to historic, untagged blog posts, this project leverages the Simon Willison blog database and scikit-learn to train and compare multi-label text classification models. Four approaches—TF-IDF + Logistic Regression, Multinomial Naive Bayes, Random Forest, and LinearSVC—were tested on posts’ title and body text using the 158 most frequently used tags.
Recent articles
- Perhaps not Boring Technology after all - 9th March 2026
- Can coding agents relicense open source through a “clean room” implementation of code? - 5th March 2026
- Something is afoot in the land of Qwen - 4th March 2026