Safe HTML checker
23rd February 2003
I’ve finally enabled a subset of HTML in my comments. In doing so, I had several requirements that needed to be fulfilled:
- Entered markup must be valid to XHTML strict, to stop comments form breaking validation and keep things nice and tidy.
- No presentational markup! I want to maintain control over how things look via my stylesheets—comments posted should only be able to use structural HTML elements.
- I should retain full control over the tags and attributes allowed in the comments.
- Submitted HTML must be kept free from anything that could pose a security risk, such as
The system I have implemented works by running submitted posts through an XML parser, which checks that each element is in my list of allowed elements, is nested correctly (you can’t put a
blockquote inside a
p for example) and doesn’t have any illegal attributes. My initial test have shown it to work pretty well, but if anyone wants to have a go at breaking it please, be my guest.
The code for the main class is available here: SafeHtmlChecker.class.php
More recent articles
- Datasette Enrichments: a new plugin framework for augmenting your data - 1st December 2023
- llamafile is the new best way to run a LLM on your own computer - 29th November 2023
- Prompt injection explained, November 2023 edition - 27th November 2023
- I'm on the Newsroom Robots podcast, with thoughts on the OpenAI board - 25th November 2023
- Weeknotes: DevDay, GitHub Universe, OpenAI chaos - 22nd November 2023
- Deciphering clues in a news article to understand how it was reported - 22nd November 2023
- Exploring GPTs: ChatGPT in a trench coat? - 15th November 2023
- Financial sustainability for open source projects at GitHub Universe - 10th November 2023
- ospeak: a CLI tool for speaking text in the terminal via OpenAI - 7th November 2023
- DALL-E 3, GPT4All, PMTiles, sqlite-migrate, datasette-edit-schema - 30th October 2023