Forum Moderators: phranque

bots and /\xa0

why would a bot give a request_uri with /\xa0

         

Hedgehog_UK

5:56 pm on Jul 3, 2010 (gmt 0)

10+ Year Member



While I was trawling through the log file, looking for weaknesses in the htaccess file, I came across this:

GET /\xa0 HTTP/1.1 404 1203 - Mozilla/5.0 (compatible; DotBot/1.1; http://www.dotnetdotcom.org/, crawler@dotnetdotcom.org)

Why would a bot request GET /\xa0?

What was it looking for?

[edited by: phranque at 12:39 am (utc) on Jul 4, 2010]
[edit reason] make url visible [/edit]

encyclo

7:29 pm on Jul 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld Hedgehog_UK :)

\xa0 is hex for a non-breaking space (&nbsp; is the equivalent HTML entity). Maybe the bot is following a broken link?
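
For anyone who wants to check this themselves, here is a minimal Python sketch. It assumes the log shows the byte in backslash-hex notation, as in the line quoted above; the helper name `unescape_log_path` is made up for illustration. It turns the escape back into the raw byte and compares it with the character that the &nbsp; entity stands for.

```python
import html
import re

def unescape_log_path(path: str) -> bytes:
    """Turn \\xhh escapes, as they appear in the access log, back into raw bytes."""
    return re.sub(
        rb"\\x([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        path.encode("latin-1"),
    )

# The request path exactly as it was written in the log line.
raw = unescape_log_path(r"/\xa0")

# The character that the HTML entity &nbsp; decodes to (U+00A0).
nbsp = html.unescape("&nbsp;")

# In Latin-1, byte 0xA0 is that same non-breaking space.
print(raw)
print(raw.decode("latin-1") == "/" + nbsp)
```

So a link written as a slash followed by a non-breaking space would produce exactly this kind of request, if the bot sends the character as a single Latin-1 byte.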

Hedgehog_UK

9:46 pm on Jul 3, 2010 (gmt 0)

10+ Year Member



Thanks encyclo. I've recently been tuning htaccess - cache settings and bot handling etc. Googling a query often pointed me towards WebmasterWorld, so I thought it was about time I registered and joined in on the fun :)

As for \xa0, you were right. The bot had been trying to follow a rogue link with a .htl extension. What doesn't help is that it's a known problem on another website :(
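
In case it helps anyone tracking down a link like this, below is a rough Python sketch that scans a chunk of HTML for anchors whose href contains a non-breaking space. The class name `NbspLinkFinder` and the sample markup are invented for illustration; they are not taken from the site in question.

```python
from html.parser import HTMLParser

NBSP = "\u00a0"  # the character behind &nbsp; and the \xa0 seen in the log

class NbspLinkFinder(HTMLParser):
    """Collect href values that contain a non-breaking space."""

    def __init__(self):
        super().__init__()
        self.suspect_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # The parser has already decoded entities in attribute values.
            if name == "href" and value and NBSP in value:
                self.suspect_links.append(value)

# Made-up sample: one broken link, one normal link.
sample = '<a href="/&nbsp;">broken</a> <a href="/index.html">fine</a>'

finder = NbspLinkFinder()
finder.feed(sample)
print(finder.suspect_links)
```

Feeding it the source of the page that carries the rogue link should list the offending href values.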

I've seen comments in the past about \xa0 being used to take advantage of a weakness in some server systems. I didn't know it was also relevant to broken links. Then again, it may have been bots doing this kind of thing that revealed the weakness in the first place.

Many thanks

phranque

11:46 pm on Jul 3, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], Hedgehog_UK!