Googlebot from Serbia?

         

mrjones

12:30 am on May 26, 2008 (gmt 0)

10+ Year Member



Heads up about this guy from Serbia pretending to be Googlebot.
He seems to be pulling pages from my sitemap, perhaps to get around a 403. I don't know.
========================================================

Host: 78.129.208.nnn

/sitemap/************92.html
Http Code: 200
Date: May 25 11:48:01
Http Version: HTTP/1.1
Size in Bytes: 9437
Referer: -
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

/sitemap/
Http Code: 404
Date: May 25 11:48:22
Http Version: HTTP/1.1
Size in Bytes: 5027
Referer: -
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

/disclaimer.php
Http Code: 200
Date: May 25 11:48:24
Http Version: HTTP/1.1
Size in Bytes: 6377
Referer: -
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

[edited by: incrediBILL at 6:21 am (utc) on May 26, 2008]
[edit reason] obfuscated IP [/edit]

incrediBILL

7:16 am on May 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Most likely it's Google crawling through a proxy for some reason.

You have to ask yourself, if you aren't using a default sitemap file, how did anyone other than Google know the name of the file?

keyplyr

7:36 am on May 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think it's just another bad bot spoofing Google. I get at least 1 or 2 a day.

incrediBILL

7:42 am on May 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, then explain how it knew the name of a unique sitemap file.

The OP's log says "Http Code: 200" for that file, so it can't be a spoof if it knew the name.

mrjones

12:58 am on May 28, 2008 (gmt 0)

10+ Year Member



I don't know much about these things, Bill.
My sitemap isn't even on my site for anyone to see. Well, there are no links to it from my site, and I have your whitelist set up in conjunction with a reverse DNS (rDNS) check, and it failed to stop the crawl.
Doing a quick lookup, I came up with:
<snip>
Serbia

It looks like the address is an apartment?

[edited by: incrediBILL at 2:57 am (utc) on May 28, 2008]
[edit reason] removed specifics [/edit]

incrediBILL

2:59 am on May 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Looks like a little hosting company, regardless of the address. Maybe the servers are in his closet, who knows ;)

Anyway, if Googlebot knew your real sitemap file name, then somehow it was tricked into crawling you via that location. I'm not sure how or why, but if it's a new trend in tricking Google, we'll see it happen more often and figure it out eventually.
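
For anyone who wants to test a "Googlebot" hit like the one in the OP's log, here is a rough Python sketch of the reverse DNS check mentioned above: look up the host name for the IP, make sure it is under a Google domain, then resolve that name forward and confirm it comes back to the same IP. This is only a sketch, not anyone's actual whitelist script. The function names are made up for illustration, the suffix list (googlebot.com, google.com) is an assumption based on Google's published advice, and it only handles IPv4. A request that arrives through a proxy, as speculated earlier in the thread, would fail this check even if the real Googlebot were behind it.

```python
import socket

# Assumption: genuine Googlebot hosts reverse-resolve to names under
# these domains. Adjust the list to whatever Google currently documents.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the host name sits under one of the assumed Google domains."""
    # The leading dot in each suffix stops "evilgooglebot.com" from matching.
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def verify_googlebot(ip,
                     reverse_lookup=socket.gethostbyaddr,
                     forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for an IPv4 address.

    The lookup functions are parameters so the logic can be exercised
    without touching the network.
    """
    try:
        hostname = reverse_lookup(ip)[0]       # step 1: IP -> host name
    except OSError:                            # no PTR record, lookup failed
        return False
    if not is_google_hostname(hostname):       # step 2: must be a Google name
        return False
    try:
        addresses = forward_lookup(hostname)[2]  # step 3: host name -> IPs
    except OSError:
        return False
    return ip in addresses                     # step 4: must match the caller
```

Calling `verify_googlebot(remote_ip)` for any request whose Agent string claims to be Googlebot, and refusing the request when it returns False, is one way to use it. The Agent string alone proves nothing, since anyone can send it.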