Differences between real and fake Googlebots

How to recognize the real Googlebot

         

hurgada

10:42 am on Feb 22, 2006 (gmt 0)

10+ Year Member



hello,

What is the best way to identify Googlebot?
Do all Googlebots use a hostname that looks like crawl-?-?-?-?.googlebot.com?

I heard that Googlebot can visit a page from a different IP address or ISP to check whether the content you show to ordinary users is the same as what you show to Googlebot.

volatilegx

3:14 pm on Feb 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi hurgada and welcome to WebmasterWorld :)

Don't always trust the hostname (such as crawlerxx.googlebot.com) when trying to verify a Googlebot spider. I have seen this spoofed.

Check the whois records on the IP address. If it is registered to Google or Savvis, then you have a legitimate Googlebot.
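
Since a hostname on its own can be spoofed, one complementary check (a different technique from the whois lookup above) is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, then resolve that hostname back and make sure you get the same IP. Below is a minimal Python sketch of the idea. The function names and the suffix list are assumptions made for illustration, and the lookups are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Assumption: genuine crawler hostnames end in one of these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

def forward_lookup(hostname):
    """Return the IPv4 addresses the hostname resolves to (empty on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []

def looks_like_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS: IP -> hostname -> same IP again."""
    host = reverse(ip)
    if not host:
        return False
    host = host.rstrip(".").lower()
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    # A spoofed PTR record fails here, because the real googlebot.com
    # zone will not map that hostname back to the faker's IP.
    return ip in forward(host)
```

Calling `looks_like_googlebot(ip)` with the default arguments does live DNS lookups, so in practice you would probably want to cache the results rather than run it on every request.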

hurgada

3:55 pm on Feb 22, 2006 (gmt 0)

10+ Year Member



thanks for the tip, volatilegx.

So I figured out that it was NOT Googlebot crawling my site, so I banned the IP address. But if another user has the same IP address, will he be unable to access my site as well?
What would be the best way to ban fake bots?

volatilegx

6:51 pm on Feb 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"But if another user has the same IP address, will he be unable to access my site as well?
What would be the best way to ban fake bots?"

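Right. If you are worried about this, you could ban bots by IP address AND user agent, so that a request is blocked only when both match and other visitors sharing that IP address still get through.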
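
As a rough illustration of the combined rule, here is a small Python sketch. The function name, the ban list, and the example address (taken from a documentation-only range) are all hypothetical; a real setup would more likely express the same condition in the web server's configuration.

```python
def should_block(ip, user_agent, banned_ips):
    """Block only when the IP is banned AND the client claims to be Googlebot."""
    claims_googlebot = "googlebot" in user_agent.lower()
    return ip in banned_ips and claims_googlebot

# Hypothetical ban list holding the fake bot's address.
banned = {"203.0.113.7"}

fake_bot = should_block("203.0.113.7", "Googlebot/2.1 (+http://www.google.com/bot.html)", banned)
human = should_block("203.0.113.7", "Mozilla/4.0 (compatible; MSIE 6.0)", banned)
```

Here `fake_bot` is True and `human` is False, so an ordinary browser coming from the same address is not locked out.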

hurgada

8:43 am on Mar 25, 2006 (gmt 0)

10+ Year Member



Are the Yahoo and MSN bots registered to Yahoo and MSN themselves, or can the registration be under a different name, as with Google and Savvis?

volatilegx

2:08 am on Mar 27, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A lot of Yahoo! bots are registered to other Yahoo! properties, but you can always trace them back to Yahoo!

MSN has bots on hotjobs and hotmail IPs, as well as several other MS properties.

netchicken1

3:10 am on Mar 27, 2006 (gmt 0)

10+ Year Member



Just a noob question... why would you want to disguise yourself as Googlebot?

volatilegx

3:52 am on Mar 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



People sometimes do it to see if their competitors are cloaking. Some cloakers use the User Agent string to determine what to show the requestor of a web page, which is known as user agent cloaking. Anyone who sends Googlebot's User Agent string then gets to see the page the spider would see.

Savvy cloakers do not rely on user agent cloaking. Instead they use IP address cloaking, or possibly IP address and user agent cloaking in combination. With IP address cloaking, the IP addresses of search engine spiders are kept in a database and the IP address of the requestor of a web page is compared to the IP addresses in the database. If a match is found, then the visitor is identified as a search engine spider, and optimized HTML is served.
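
To show why a spoofed User Agent does not get past this, here is a minimal Python sketch of the lookup step only. The "database" is just an in-memory list, the networks are documentation-only ranges rather than real spider addresses, and the function and page names are made up for illustration.

```python
import ipaddress

# Hypothetical spider "database"; these are documentation ranges, not real spider IPs.
SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]

def is_known_spider(ip):
    """Return True if the requestor's IP falls inside a listed spider network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in SPIDER_NETWORKS)

def choose_page(ip):
    """Pick which version of the page the requestor would be served."""
    return "optimized.html" if is_known_spider(ip) else "regular.html"
```

Note that the User Agent string never enters into the decision, so a visitor who only fakes the Googlebot User Agent is still served the regular page.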