Strange 404 Errors

Log says pages not found - but they're there!

madmatt69

1:57 pm on Jun 3, 2002 (gmt 0)

Hi everyone - Question about some strange log activity. I've just analyzed my May log file, and I have a tonne of 404 errors. This isn't exclusive to May, I've seen it before, but now it's starting to bother me :)

The URL that shows a 404 error is always written with an odd space in it. Example: [mysite.com...] main/index.html

Note the space there.

But all the links are written correctly:
[mysite.com...]

I believe the errors are coming from spiders, but can anyone tell me why they do this? Why are spaces appearing?

Thanks!

richlowe

2:29 pm on Jun 3, 2002 (gmt 0)

Your example has an extra space in it which might be the problem. I once had someone link to a page who added a space to the END of the URL, and this caused many 404 errors. How to get around it? On Apache, I just created a page with a redirect which included the extra space.

Richard Lowe

dcheney

4:01 pm on Jun 3, 2002 (gmt 0)

I also recently had a case where someone posted a link to my site and had a trailing comma at the end of the URL! Same fix - just made a dummy file with the comma that forwarded them to the proper page.

buckworks

6:39 pm on Jun 3, 2002 (gmt 0)

Is there some kind of catch-all way that a site could cope gracefully with anything that ends in .html. or .html, by correcting it to just .html? A URL followed by a comma or period is common in discussion groups, message boards, etc., and it would be nice to be able to seamlessly send visitors to the correct page without having to make a redirect for each page where you notice it happening. Any suggestions?

richlowe

7:55 pm on Jun 3, 2002 (gmt 0)

On IIS (Windows) you could write an ASP script to do it and use the ASP script as your 404 error page. I'd bet you can do it in the .htaccess file on Apache as well. If not, PHP could certainly do it.
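
Whatever language the 404 handler ends up in, the clean-up logic is the same. Here is a minimal sketch of it in Python (not ASP, .htaccess, or PHP, just chosen for illustration). The names `normalize_path`, `redirect_target`, and `known_paths` are hypothetical, and wiring the result into an actual redirect is left out:

```python
import re
from typing import Optional, Set
from urllib.parse import unquote

# Characters that are usually sentence punctuation stuck onto a pasted URL,
# not part of the URL itself. This list is an assumption; adjust as needed.
TRAILING_JUNK = " \t.,;:!?)>\"'"


def normalize_path(path: str) -> str:
    """Return a cleaned-up candidate for a mistyped request path."""
    cleaned = unquote(path)                 # turn %20 back into a literal space
    cleaned = re.sub(r"\s+", "", cleaned)   # drop stray whitespace anywhere
    return cleaned.rstrip(TRAILING_JUNK)    # drop trailing "." or "," etc.


def redirect_target(path: str, known_paths: Set[str]) -> Optional[str]:
    """Return the path to redirect to, or None to serve the normal 404.

    known_paths is a hypothetical set of paths that really exist on the
    site; checking against it avoids redirecting to another missing page.
    """
    fixed = normalize_path(path)
    if fixed != path and fixed in known_paths:
        return fixed
    return None
```

For example, `redirect_target("/main/index.html,", {"/main/index.html"})` gives back `/main/index.html`, while a request for a page that is missing even after clean-up returns `None`. Note that stripping every space would break a site that has real file names with spaces in them, which is why the sketch only redirects to paths it knows exist.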

Richard Lowe

madmatt69

7:22 am on Jun 4, 2002 (gmt 0)

Thanks for the info but I must not have made the question clear :)

Yes, the space in the URL is the problem - that's what I'm asking about. In my code, in my HTML pages, of course I use proper syntax - no spaces. But in the log files, I get 404 errors pointing to URLs with spaces in them! So say I have a file at www.blah.com/index.htm, the log file shows a 404 error at www.blah.com/ index.htm

But the question is - where is the extra space coming from? This is happening for several pages throughout the site.

I have a feeling that perhaps it's a spider of some sort that is having trouble and adding spaces to the URLs it's trying to grab. Otherwise, I'm sure I would have heard a user or two complain by now.

I can put up a snapshot of my logfile, if that helps clarify at all?

Thanks again guys - and very interesting about the redirect. Maybe that could somehow help. But like I said, I don't think it's actually a human that's getting these 404s, I believe it's a spider.
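
One way to check the spider theory is to tally which user agents and referrers are behind the 404s that contain a space. The sketch below is in Python and assumes the log is in Apache "combined" format (with referrer and user-agent fields); if the log uses a different format the pattern will need changing. The names `LOG_RE` and `spaced_404s` are made up for this example:

```python
import re
from collections import Counter

# Assumed layout: Apache combined log format.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def spaced_404s(lines):
    """Count user agents and referrers for 404s whose path contains a space."""
    agents = Counter()
    referers = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or m.group("status") != "404":
            continue
        # The request looks like: METHOD path PROTOCOL. The path itself may
        # contain a literal space, so split the method off the front and the
        # protocol off the back rather than splitting on every space.
        _method, _, rest = m.group("request").partition(" ")
        path, _, proto = rest.rpartition(" ")
        if not proto.startswith("HTTP/"):
            path = rest  # no protocol field was logged
        if " " in path or "%20" in path:
            agents[m.group("agent") or "-"] += 1
            referers[m.group("referer") or "-"] += 1
    return agents, referers
```

Calling `spaced_404s(open("access.log"))` (the file name is a placeholder) and looking at `agents.most_common(10)` should show whether a handful of crawlers account for the bad requests. The `referers` counter points to any external pages whose links contain the stray space.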

richlowe

3:20 pm on Jun 4, 2002 (gmt 0)

The space is almost certainly coming from people who have linked to the page and misspelled the URL. Happens to me all of the time.

Richard Lowe