Strange errors popping up in my logs

         

madmatt69

9:51 pm on Jul 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey all,

I'm getting lots of errors popping up in my log files, such as:

"File does not exist: /home/widgets/public_html/gifhttp://www.widgets.com/http://www.widgets.com/gif"

No clue as to why. I haven't found anything on the site that links to a folder called /gif. The same thing appears for /jpg.

I've done searches in Google for links from other sites that maybe link to these non-existent folders on my site, but none are out there.

I'm also getting similar errors where bots (like Googlebot) are requesting URLs off of the main root.

So I have a folder say "widgets/bluewidgets/" and that's the only place it appears, and all links point to that location.

However the bot looks off the root for "/bluewidgets/" and gets a 404 because it's really in "widgets/bluewidgets/". This seems to be happening with lots of files. I thought maybe it had something to do with my use of include files, however all my links are absolute in the code, so I don't know how it would goof up the addresses. The regular pages still get spidered though.

Anyone else experienced something like this?

madmatt69

11:46 pm on Jul 14, 2005 (gmt 0)



It seems like most of these errors are being generated when someone links to an image on my site.

For example, Google Images. Or Xanga, where some kid hot-links an image off my site. I've got hot-link protection on, so the image doesn't show, but I wonder if maybe that's what's generating the error?

jdMorgan

12:41 am on Jul 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is very useful in these cases to cross-reference the error log to the raw access log by using the time stamp. Then you can see what IP address is making the request, look up the associated domain name of that IP, and check the referrer and the user-agent in the request. This will often give you enough information to decide if these are malformed requests from 'junk' robots, bad links on other sites, or problems within your own site.
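A rough sketch of that cross-referencing step in Python. It assumes the error log has lines shaped like "[Thu Jul 14 21:51:03 2005] [error] [client IP] message" and the access log uses the Apache "combined" format, and that both logs are stamped from the same clock. The sample lines, IP addresses, referrer, and user-agent are made up for illustration, and the function names are hypothetical; adjust the patterns if your server logs differently.

```python
import re
from datetime import datetime

# Assumed error-log shape: [Thu Jul 14 21:51:03 2005] [error] [client 1.2.3.4] message
ERROR_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[error\] \[client (?P<ip>[^\]]+)\] (?P<msg>.*)$'
)
# Assumed "combined" access-log shape:
# ip ident user [14/Jul/2005:21:51:03 +0000] "request" status bytes "referrer" "user-agent"
ACCESS_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_error_line(line):
    """Return timestamp, client IP and message from one error-log line, or None."""
    m = ERROR_RE.match(line.strip())
    if not m:
        return None
    ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
    return {"ts": ts, "ip": m.group("ip"), "msg": m.group("msg")}


def parse_access_line(line):
    """Return the fields of one combined-format access-log line, or None."""
    m = ACCESS_RE.match(line.strip())
    if not m:
        return None
    # Drop the timezone offset; assumes both logs use the same clock.
    ts = datetime.strptime(m.group("ts").split()[0], "%d/%b/%Y:%H:%M:%S")
    entry = m.groupdict()
    entry["ts"] = ts
    return entry


def cross_reference(error_lines, access_lines):
    """Pair each error with access entries from the same IP in the same second."""
    by_key = {}
    for line in access_lines:
        entry = parse_access_line(line)
        if entry:
            by_key.setdefault((entry["ts"], entry["ip"]), []).append(entry)
    pairs = []
    for line in error_lines:
        err = parse_error_line(line)
        if err:
            for hit in by_key.get((err["ts"], err["ip"]), []):
                pairs.append((err, hit))
    return pairs


# Made-up sample lines, for illustration only.
sample_errors = [
    "[Thu Jul 14 21:51:03 2005] [error] [client 192.0.2.10] "
    "File does not exist: /home/widgets/public_html/gif",
]
sample_access = [
    '192.0.2.10 - - [14/Jul/2005:21:51:03 +0000] "GET /gif HTTP/1.1" 404 208 '
    '"http://www.example.com/page.html" "ExampleBot/1.0"',
    '192.0.2.99 - - [14/Jul/2005:21:51:03 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/4.0"',
]

for err, hit in cross_reference(sample_errors, sample_access):
    print(err["msg"])
    print("  request:   ", hit["request"])
    print("  referrer:  ", hit["referrer"])
    print("  user-agent:", hit["agent"])
```

Once each error is paired with its request, the referrer tells you which page supplied the bad link and the user-agent tells you who followed it. A reverse DNS lookup on the IP is the remaining step and is left out here.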

You could also have problems if you have incorrectly implemented redirects or rewrites, or if you have a bad "base URL" tag (the HTML base element with its href attribute, which is not actually a meta-tag) on your pages. Basically, anything that confuses the user-agent as to how to resolve and canonicalize the URL. To emphasize that point: it is the browser or robot, and not the server, that resolves relative URLs if you use them. (I'm including this because 'relative,' 'absolute,' and 'canonical' URLs are all different things, and the terminology is often misused.)
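To illustrate how a client resolves links, here is a small sketch using Python's standard urllib.parse.urljoin, which follows the same general resolution rules a browser or robot applies. The page name index.html is hypothetical, and this shows possible causes rather than a diagnosis of this particular site.

```python
from urllib.parse import urljoin

# Hypothetical page that carries the link.
PAGE = "http://www.widgets.com/widgets/index.html"

# A relative link is resolved against the directory of the page itself.
from_page = urljoin(PAGE, "bluewidgets/")

# If the page declares a base URL pointing at the site root, the same
# relative link is resolved against the root instead.
from_bad_base = urljoin("http://www.widgets.com/", "bluewidgets/")

# A link starting with "/" is resolved from the root regardless of
# where the page lives.
root_relative = urljoin(PAGE, "/bluewidgets/")

print(from_page)      # http://www.widgets.com/widgets/bluewidgets/
print(from_bad_base)  # http://www.widgets.com/bluewidgets/
print(root_relative)  # http://www.widgets.com/bluewidgets/
```

The last two cases both produce a request for "/bluewidgets/" off the root, which is the kind of 404 described in the first post.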

Jim