
Simon Willison’s Weblog

How to track an RSS feed

According to the HTTP specification, RSS/Atom aggregators should obey an HTTP 301 Moved Permanently response by updating the stored subscription URL for the feed they are attempting to retrieve.

This behaviour can be used to track repeat aggregator hits to a feed, in essence the equivalent of setting a permanent cookie. The first time an aggregator hits the published feed address, a 301 redirect is served, sending that aggregator to a new URL incorporating a unique ID. The aggregator permanently changes the stored subscription URL, meaning future requests for that feed will carry the unique ID that was assigned the first time the feed was retrieved.
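As a rough illustration of the server side of this, here is a minimal Python sketch. The `sid` query parameter, the `/feed.xml` path and the `handle_feed_request` function are all hypothetical names chosen for the example, not anything a particular server or framework provides.

```python
import uuid
from urllib.parse import urlsplit, parse_qs


def handle_feed_request(url, new_id=lambda: uuid.uuid4().hex):
    """Return (status, headers, subscriber_id) for a request to the feed.

    Assumption: the unique ID travels in a hypothetical "sid" query parameter.
    """
    parts = urlsplit(url)
    ids = parse_qs(parts.query).get("sid")
    if not ids:
        # First hit on the published address: hand out a fresh ID and
        # redirect permanently, so a conforming aggregator stores the new URL.
        location = f"{parts.path}?sid={new_id()}"
        return 301, {"Location": location}, None
    # Repeat hit: the stored URL already carries the ID assigned earlier.
    return 200, {"Content-Type": "application/xml"}, ids[0]


if __name__ == "__main__":
    status, headers, sid = handle_feed_request("/feed.xml")
    print(status, headers["Location"])
    print(handle_feed_request(headers["Location"]))
```

Every later request to the redirected URL can then be logged against the returned subscriber ID.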

At its most innocent, this could allow people to track their number of unique subscriptions—although the value of this would be severely diluted if people started deliberately subscribing to the same redirected feed URL. I’m sure there are more insidious uses for this as well.

Maybe aggregators should prompt users when a feed has permanently moved, to prevent them from being tracked without their knowledge.
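On the aggregator side, the prompt suggested here might look something like the following Python sketch. The function name `updated_subscription_url` and the `confirm` callback are assumptions made for illustration; a real aggregator would plug in its own dialog or preference check.

```python
from urllib.parse import urljoin


def updated_subscription_url(stored_url, status, location, confirm):
    """Decide which URL the aggregator should store after a fetch.

    `confirm(old_url, new_url)` is a hypothetical callback that asks the
    user whether to accept the move and returns True or False.
    """
    if status == 301 and location:
        # Resolve a relative Location header against the stored URL.
        target = urljoin(stored_url, location)
        if target != stored_url and confirm(stored_url, target):
            return target
    # Temporary redirects, and permanent ones the user declined,
    # leave the stored subscription URL unchanged.
    return stored_url


if __name__ == "__main__":
    def ask(old, new):
        answer = input(f"{old} has moved permanently to {new}. Update? [y/N] ")
        return answer.strip().lower() == "y"

    print(updated_subscription_url(
        "http://www.example.com/feed.xml", 301, "/feed.xml?sid=abc123", ask))
```

Declining the prompt means the aggregator keeps requesting the original address, so no unique ID is ever stored.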

This is How to track an RSS feed by Simon Willison, posted on 1st September 2004.


11 comments

  1. The downside, of course, is that your bandwidth use will shoot up due to the fact that public caches cannot work as effectively.

    I'm not particularly concerned about the privacy issues; the site owner will only have the personal information that their visitors provide in the first place. The fact that your feed habits can be distinguished from everyone else's means nothing as long as it's just an anonymous distinct reader.

    Jim Dabell - 1st September 2004 01:33 - #

  2. I think this would be a great way of tracking feed habits and statistics - even providing a way to stop individual offenders from bashing your server too hard. All you would need to do is limit requests for a specific subscription to 1 per (half?) hour.

    There is only one issue I have with all of this:

    "RSS/Atom aggregators should obey..."

    Does anyone know to what extent this is implemented by the aggregators? Using this method you do of course run the risk of excluding anyone whose aggregator is not conformant.

    While from a purist point of view this would be ideal, pushing people towards more conformant software, in practice this would probably be a Bad Thing.

    Noah Slater - 1st September 2004 11:59 - #

  3. Aggregators that don't follow HTTP redirects are broken. Treating a permanent redirect as a temporary redirect is quite common, though.

    Fredrik - 1st September 2004 13:38 - #

  4. Why not just flip this concept? Instead of having a redirect to a unique URL, just present a unique URL in the first place for each page view. Or better yet present the same unique URL to the user, based on a stored browser cookie. This would completely get rid of any problems with aggregators not supporting permanent redirects.

    Frank Wiles - 1st September 2004 20:55 - #

  5. certainly is easier to implement than personalised etags ;-)

    eric scheid - 2nd September 2004 07:39 - #

  6. ... just present a unique URL in the first place for each page view.

    By this, do you mean that every time you display a link to the RSS/Atom file that it is generated with a unique ID (UID) encoded into it?

    I think this would be much better than displaying a link to a resource that has permanently moved. It makes more sense as people usually only subscribe once to a feed. You would however need to make sure that any UIDs that have been "activated" or requested are not displayed ever again.

    A little off topic, but has anyone thought about enabling a feed to be customized?

    For example, if you categorised your postings into 6 distinct topics you could allow the user to subscribe to a feed which, say, only displayed posts relating to "Python", "PHP" and "Apache" instead of "CSS", "XML" and "Personal".

    From my own personal experience I usually have about 150-odd new posts to look through every day in NetNewsWire. I usually find that I only want to read about a third of these, and the ability for me to filter people's feeds would be invaluable.

    Just a thought. :)

    Noah Slater - 2nd September 2004 09:21 - #

  7. I agree that personalisation of feeds would be an important feature, esp for syndication purposes. Perhaps you could keep the same URL but just add some query string parameters?

    Chris Beach - 2nd September 2004 09:34 - #

  8. For example:

    http://www.example.com/feeds/blog.xml?id=28739&categories=1,2,5

    Noah Slater - 2nd September 2004 10:20 - #

  9. $HTTP_USER_AGENT

    internet - 2nd September 2004 10:32 - #

  10. Adrian Holovaty has had customised RSS feeds for a while. It's basically just down to whatever CMS you are using I guess.

    Ben Meadowcroft - 2nd September 2004 12:42 - #

  11. Using HTTP redirects is fine, except that this assumes the aggregator actually understands HTTP and follows these redirects.

    I think issuing a new unique ID with the RSS URL should be a better method, at least in most cases, so I coded a tiny script that demonstrates this. I hope you give it a try and tell me what you think. You can read about it too if you have some time to spend.

    Rami Kayyali - 12th September 2004 02:04 - #

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2004/09/01/track
