| 11:51 am on Jan 3, 2007 (gmt 0)|
And here I see another victim of this bug. Is anybody else seeing this?
| 12:34 pm on Jan 3, 2007 (gmt 0)|
<base href="http://www.-----.com/" />
I have had it like this for years and never had a problem.
| 1:11 pm on Jan 3, 2007 (gmt 0)|
<base href="http://www.example.com/" /> can be a problem if you use it on a page that is in an interior directory.
For example, if the page is in the /news/ directory and it has a relative link to another page in the /news/ directory, the base tag above is telling the user agent not to use the /news/ directory in the file path, but to calculate it from the domain root. The correct value for the base href tag is the fully qualified absolute address of the page itself.
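The resolution behaviour described above can be sketched with Python's standard `urllib.parse.urljoin`, which follows the same relative-reference rules (RFC 3986) a user agent applies. The example.com paths here are placeholders, not the masked site:

```python
from urllib.parse import urljoin

# The page lives at /news/page.html and contains a relative link "contact.html".

# With <base href="http://www.example.com/">, the /news/ directory is dropped:
print(urljoin("http://www.example.com/", "contact.html"))
# http://www.example.com/contact.html

# With the base set to the page's own fully qualified URL, the path
# resolves as intended:
print(urljoin("http://www.example.com/news/page.html", "contact.html"))
# http://www.example.com/news/contact.html
```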
What kind of error recovery a bot might have at that point would only be guesswork for us, but technically it would have an error to cope with.
| 2:13 pm on Jan 3, 2007 (gmt 0)|
tedster: you are totally right, and that's exactly my point: all robots and browsers handle it correctly except Googlebot. Googlebot just ignores the base href element and calculates the wrong path.
Sometimes it goes recursive through a site and indexes URLs like this: www.-----.com/news/contact/news/contact/news/contact/ and so on.
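A hypothetical sketch of how such recursive URLs can arise: if a page links with a path-relative href like "news/contact/" (no leading slash) and the crawler ignores the base element, each hop resolves against the already-wrong path. Again, example.com stands in for the masked domain:

```python
from urllib.parse import urljoin

url = "http://www.example.com/news/contact/"
# Suppose the page at this URL contains a relative link "news/contact/";
# a crawler that ignores <base href> resolves it against the current path:
for _ in range(2):
    url = urljoin(url, "news/contact/")
    print(url)
# http://www.example.com/news/contact/news/contact/
# http://www.example.com/news/contact/news/contact/news/contact/
```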
Extra info: this is not something that has been broken for ages; it's a new bug that I'm seeing more and more.
[edited by: NedProf at 2:22 pm (utc) on Jan. 3, 2007]
| 2:50 pm on Jan 3, 2007 (gmt 0)|
I remember one report like this last summer, and I was not able to pin down any reason for it that the site could be responsible for. Just make sure your server returns a true 404 status code for those bad URLs. Many .NET sites return a 302 to serve a custom error page -- that can spell trouble, especially combined with this issue.
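For sites on Apache, a minimal sketch of serving a custom error page while still returning a true 404 status (the /errors/404.html path is an assumption): the ErrorDocument directive keeps the 404 status only when given a local path, whereas a full URL makes Apache issue a 302 redirect to it.

```apache
# httpd.conf or .htaccess (hypothetical /errors/404.html page)

# A local path keeps the 404 status code:
ErrorDocument 404 /errors/404.html

# A full URL would instead trigger a 302 redirect -- exactly the
# pattern to avoid here:
# ErrorDocument 404 http://www.example.com/errors/404.html
```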