lammert - 2:57 am on Oct 9, 2010 (gmt 0)
The <base href> should be referring to the published URL of the document
No, this is not true. The base href should refer to the location that relative URLs in the source code are resolved against to build absolute URLs. It may or may not be the same as the URL of the document [w3.org...] Both legitimate search engines and browsers parse the base href correctly when it differs from the URL of the document, but many scrapers crash on it. I therefore deliberately use a base href that differs from the main URL on almost all my sites. It helps keep out scrapers and bad bots without affecting legitimate visitors.
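To make the resolution rule concrete, here is a rough sketch in Python of how a well-behaved client would build absolute URLs. The example.com URLs and the resolve_link helper are made up for illustration; they are not taken from any of my sites, and a real browser does more than this.

```python
from urllib.parse import urljoin


def resolve_link(document_url, link, base_href=None):
    """Resolve a link the way a client honoring <base href> would.

    Hypothetical helper for illustration only.
    """
    # If a base href is present, it replaces the document URL as the
    # starting point. Joining it with the document URL first also
    # covers a base href that is itself relative.
    base = urljoin(document_url, base_href) if base_href else document_url
    return urljoin(base, link)


# Hypothetical URLs, assumed for this example.
document_url = "http://www.example.com/articles/page.html"
base_href = "http://www.example.com/content/"

# No base href: the relative link resolves against the document URL.
without_base = resolve_link(document_url, "images/photo.png")

# Base href differs from the document URL: the link resolves against the base.
with_base = resolve_link(document_url, "images/photo.png", base_href)

print(without_base)
print(with_base)
```

A scraper that ignores the base href and always resolves against the document URL ends up with the first result and requests a path that does not exist.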