
Wed, 14 May 2003

The joy of standards

I'm sure I've mentioned this before, but one of the things I do with my copious free time is polish my RSS-to-email aggregator. As part of this I, clearly, have to fetch the RSS from web servers. This, you would think, should be a straightforward thing, as HTTP is a nice, simple, well-defined protocol. You would be more or less right.

The most recent tweak was to make sure that permanent redirects were handled correctly, i.e. after it receives one the aggregator stores the new URI for that RSS feed and uses it in future. This was all fine until today when it told me:

    Failed to fetch from http://www.example.com/index.rss:
    URL must be absolute [400]

Huh? It's at this point that a brief explanation of how HTTP redirects work is in order. If your browser requests a URI that the web server knows is somewhere else, it will return a 300-and-something response code. It also returns a Location header containing the URI that you are being redirected to. The browser then goes off and requests the new URI. All this happens behind the scenes and you're probably not even aware it's happened. If you are writing a client, though, you have to write the code to deal with this yourself.
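
On the wire a permanent redirect looks something like this (the paths here are made up for the sake of the example):

    GET /index.rss HTTP/1.1
    Host: www.example.com

    HTTP/1.1 301 Moved Permanently
    Location: http://www.example.com/feeds/index.rss

The client is then expected to turn around and request http://www.example.com/feeds/index.rss instead.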

Now, if you read the HTTP spec you'll see that the Location header should be an absolute URI. That is, it should contain the full URI of the new resource. It's at this point that you realise the cause of the above error: the Location header had a relative URI. A quick poke into the LWP::UserAgent source, a cut and paste of the relevant code, and all is well.
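
For the record, the patched-up fetch now does something along these lines. This is a sketch rather than the real aggregator code (or the exact lines lifted from LWP): automatic redirect following is switched off so the 301 is visible, and the feed bookkeeping is left as a comment.

    use LWP::UserAgent;
    use HTTP::Request;
    use URI;

    my $feed_url = 'http://www.example.com/index.rss';   # the stored URI for this feed

    my $ua  = LWP::UserAgent->new;
    # simple_request doesn't follow redirects, so the 301 is ours to handle
    my $res = $ua->simple_request( HTTP::Request->new( GET => $feed_url ) );

    if ( $res->code == 301 ) {
        # the Location header should be absolute, but resolve it against
        # the URI we actually asked for in case it's relative
        my $new_uri = URI->new_abs( $res->header('Location'),
                                    $res->request->uri );
        # store $new_uri for this feed and fetch from it next time
    }

URI->new_abs does the resolving, which is essentially the trick LWP::UserAgent itself uses when it follows redirects on your behalf.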

The biggest problem with standards is not that there are so many to choose from, it's that when people pick one they often don't actually read it very carefully. I know I've been, and no doubt will be, guilty of it.

posted at: 05:17 #
