Forum Moderators: DixonJones

bad referer someplace on the net

filename with a space after the .htm

nancyb

11:04 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm getting lots of 404s for a file that is requested with a space (%20) after the .htm. The bad link isn't in my pages, because I've run all kinds of searches on my hard drive, and I can't find it anywhere on the web either.

Any ideas how to go about finding/correcting something like this, or is this just msnbot that got discombobulated?

65.xx.xx.141 - - [05/Feb/2005:01:34:39 -0700] "GET /file-name.htm%20 HTTP/1.0" 404 3013 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
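One way to see how often this happens, and whether any request ever carries a referrer, is to filter the access log for 404s whose path ends in %20. The following is only a rough sketch: it assumes the log uses the common "combined" format seen in the line above, and the names `LOG_PATTERN` and `trailing_space_404` are made up for illustration.

```python
import re

# Assumed layout: Apache-style "combined" log format, as in the sample line.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def trailing_space_404(line):
    """Return the parsed fields if the line is a 404 for a path ending in %20."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # line is not in the assumed format
    if match.group("status") == "404" and match.group("path").endswith("%20"):
        return match.groupdict()
    return None

# Sample line from the post, with the obfuscated IP written without a space.
sample = ('65.xx.xx.141 - - [05/Feb/2005:01:34:39 -0700] '
          '"GET /file-name.htm%20 HTTP/1.0" 404 3013 "-" '
          '"msnbot/1.0 (+http://search.msn.com/msnbot.htm)"')

hit = trailing_space_404(sample)
if hit:
    print(hit["path"], hit["referer"], hit["agent"])
```

Running this over the whole log and collecting the `referer` and `agent` fields would show whether anything other than msnbot requests the bad URL.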

pendanticist

11:23 am on Feb 5, 2005 (gmt 0)

Could that be the way your server logs present a "?" character?

If so, I've asked before why there was a ? at the end of my root URL. The consensus was that it comes from folks who tack that onto the end to make sure they get the current page, as opposed to a cached copy.

I can't find the chart now, but I know there is something on my machine that maps these codes to the actual characters.

For example, I think %22 is the " character, and so on.

Other such instances became a problem for me once, and I just redirected that bogus referral to a valid extension.

<added> Cool. :) Thanks Sanenet. </added>

[edited by: pendanticist at 11:48 am (utc) on Feb. 5, 2005]

Sanenet

11:44 am on Feb 5, 2005 (gmt 0)

? is %3F. ' ' (space) is %20.
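For anyone who wants to check these codes without a chart, Python's standard `urllib.parse` module can do the conversion both ways. This is just a quick illustration, not something from the original logs:

```python
from urllib.parse import quote, unquote

# Encode single characters to their percent-escaped form.
question_mark = quote("?", safe="")
space = quote(" ", safe="")
double_quote = quote('"', safe="")
print(question_mark, space, double_quote)

# Decode the path from the log entry to see the trailing space.
decoded = unquote("/file-name.htm%20")
print(repr(decoded))
```

The decoded path ends in a literal space, which is why the server cannot find a matching file.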

There must be a space in an href somewhere. It's strange that the referrer doesn't show up in your logs. Have you tried capturing the cgi.http_referer header in your 404 page?

Have you tried searching in MSN for a page name with a space in it?
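If the stray space is in an href on a page you control, a small script can hunt for it. The sketch below assumes quoted href values and plain .htm/.html files; the function names and the `site_root` directory are hypothetical, and it will not catch links on other people's sites.

```python
import re
from pathlib import Path

# Quoted href values that end in whitespace or a literal %20.
HREF_TRAILING_SPACE = re.compile(
    r"""href\s*=\s*(["'])([^"']*?(?:\s|%20)+)\1""",
    re.IGNORECASE,
)

def find_trailing_space_hrefs(html):
    """Return every href value in the markup that ends in a space or %20."""
    return [m.group(2) for m in HREF_TRAILING_SPACE.finditer(html)]

def scan_site(site_root):
    """Walk a local copy of the site and yield (file, href) for each bad link."""
    for path in Path(site_root).rglob("*.htm*"):
        text = path.read_text(errors="replace")
        for href in find_trailing_space_hrefs(text):
            yield path, href

# Small self-check on made-up markup.
sample_html = '<a href="file-name.htm ">bad</a> <a href="other.htm">ok</a>'
print(find_trailing_space_hrefs(sample_html))
```

Calling `scan_site` on a local copy of the pages would list each file containing such a link, which is easier to trust than a plain text search for the filename.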

larryhatch

1:15 pm on Feb 5, 2005 (gmt 0)

ALL my pages are of the form /filename.html. There's not a single .htm file in the lot.
Every once in a while I'll see a couple of filename.htm entries in my logs.
I never could figure it out; they didn't 404 or anything.
I presume some search engine reported a mistaken URL. -Larry