I'm getting lots of errors popping up in my log files, such as:
"File does not exist: /home/widgets/public_html/gifhttp://www.widgets.com/http://www.widgets.com/gif"
I have no clue why. Nothing I've found on the site links to a folder called /gif. The same thing appears for /jpg.
I've searched Google for links from other sites that might point to these non-existent folders on my site, but there are none out there.
I'm also getting similar errors where bots (like googlebot) are requesting URLs off of the main root.
Say I have a folder "widgets/bluewidgets/". That's the only place it exists, and all links point to that location.
However the bot looks off the root for "/bluewidgets/" and gets a 404 because it's really in "widgets/bluewidgets/". This seems to be happening with lots of files. I thought maybe it had something to do with my use of include files, however all my links are absolute in the code, so I don't know how it would goof up the addresses. The regular pages still get spidered though.
Anyone else experienced something like this?
You could also have problems if you have incorrectly implemented redirects or rewrites, or if you have a bad "base URL" tag (<base href>) on your pages. Basically, anything that confuses the user-agent as to how to canonicalize the URL. To emphasize that point, it is the browser or robot, and not the server, that resolves relative URLs if you use them. (I'm including this because 'relative,' 'absolute,' and 'canonical' URLs are all different things, and the terminology is often misused.)
Jim
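To make the point about client-side resolution concrete, here is a rough sketch in Python using the standard library's urljoin, which follows the same resolution rules a browser or robot applies. The page URL, the hrefs, and the resolve_links helper are all made up for illustration; they are not taken from the actual site, and this is not a diagnosis of the specific log entries above.

```python
from urllib.parse import urljoin

def resolve_links(base_url, hrefs):
    """Resolve each href against base_url the way a user-agent would."""
    return {href: urljoin(base_url, href) for href in hrefs}

# Hypothetical page that links to the bluewidgets folder.
page = "http://www.widgets.com/widgets/index.html"

hrefs = [
    "bluewidgets/",         # relative: resolved against the page's folder
    "/bluewidgets/",        # root-relative: resolved against the site root
    "www.widgets.com/gif",  # scheme left off: treated as a relative path
]

resolved = resolve_links(page, hrefs)

# The same relative href resolved against a base URL that points at the
# site root (for example, from a <base href> tag) lands in a different place.
with_root_base = urljoin("http://www.widgets.com/", "bluewidgets/")

for href, url in resolved.items():
    print(f"{href!r:25} -> {url}")
print(f"{'bluewidgets/ (root base)':25} -> {with_root_base}")
```

Only the first href ends up in "widgets/bluewidgets/". A leading slash, or a base URL set to the site root, sends the robot to "/bluewidgets/" instead, and a link with the "http://" left off gets glued onto the current folder as if it were a path.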