
Best way to recognize googlebot in order to cloak (for redirection)?



12:45 am on Jul 11, 2008 (gmt 0)

5+ Year Member

Before you start saying "this is not allowed" and the like, I'll point out that the kind of cloaking I need is done by massive websites (international, very well-known multi-million-dollar companies). I can't post the names here as that's not allowed, but if you spend some time researching you will find some yourself.
These sites serve different languages for different countries and automatically redirect visitors based on the country they connect from (using the IP address). However, they cloak the page so that search engines and other bots are NOT redirected. You can test this easily by changing your user-agent to a search engine's: you won't be redirected.

So back to the question. The only bot-detection tool I've heard of is BrowserHawk. Do you think that's the best choice? What do the market leaders use, and is there a way to test whether they are using it?
I'd like to do the same as them: we're in the same business, and I need to compete at the same level.
If not BrowserHawk, can you suggest how to do it? Any open-source alternatives? Any custom code?
Any help would be much appreciated.


5:13 am on Jul 11, 2008 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

Most major search engines can easily be detected using full-trip DNS: reverse-resolve the crawler's IP to a hostname, then forward-resolve that hostname and confirm it returns the same IP.
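As an illustration, a minimal Python sketch of such a full-trip check. The Google hostname suffixes below are assumptions for the example; each search engine publishes its own crawler hostnames:

```python
import socket

# Hostname suffixes assumed for Google's crawlers in this sketch;
# other engines document their own crawler hostnames.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def hostname_is_google(host):
    """Pure suffix check on the reverse-DNS name. On its own this is
    spoofable, which is why the forward lookup below is also required."""
    return host.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)

def is_verified_googlebot(ip):
    """Full trip: IP -> hostname (reverse lookup), hostname -> IPs
    (forward lookup), and the original IP must be in the forward answer."""
    try:
        host = socket.gethostbyaddr(ip)[0]              # reverse lookup
    except socket.herror:
        return False
    if not hostname_is_google(host):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]  # forward lookup
    except socket.gaierror:
        return False
    return ip in forward_ips
```

The forward step is what defeats spoofing: anyone can point reverse DNS for their own IP at a googlebot.com-looking name, but only Google controls what those names forward-resolve to.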

Some useful info here:

Hope that helps!


11:04 am on Jul 11, 2008 (gmt 0)

5+ Year Member

Yes thanks, it helped to clarify a lot of things.
My first guess with BrowserHawk was completely wrong :).

Anyway, other sources I've read suggest SpiderSpy by fantomas.

I would have to pay $258 every year, but at least they seem to provide a complete, regularly updated list of IPs against which I can tweak the page (with good intent, of course).

Would you trust this list?
Do you think Google or the other big search engines could automatically get past this list of IPs if they want to test for cloaking?

I'm not too worried about manual checks: if the technique is used with good intent and the reviewer has common sense, the site shouldn't be banned (though yes, it's guesswork how tolerant they are). I'm only talking about automatic cloaking detection implemented by Google.


1:55 pm on Jul 11, 2008 (gmt 0)

5+ Year Member

I did some more research, and the three market leaders in my field all use "cloaking". What's bugging me is that I spent the time to understand all I can, the risks, etc., and meanwhile the market leaders cloak using ONLY user-agent strings.
This is quite shocking to me. They present a completely different page to users and to bots, and don't even do a reverse lookup of the IP address.
Do you see any reason why this might be their choice? Do you think that by making it so easy to detect (you don't even have to change your IP) they run less of a risk?
Why aren't they paying attention to possible penalties? Maybe having PR above 6 makes them "untouchable", so they don't have to deal with any of these problems?
That wouldn't be fair.
Do you think I should follow their example?


6:34 pm on Jul 13, 2008 (gmt 0)

5+ Year Member

Found an alternative solution in the end: we decided on using Google-approved policies only, to be on the safe side.


7:41 am on Jul 18, 2008 (gmt 0)

So what are Google-approved policies, after all?


2:41 am on Sep 15, 2008 (gmt 0)

5+ Year Member

I have an issue with my pages <specifics removed>. On major pages where I have products listed, I list all the products in the left-hand pane. When a user clicks one of these products, I check for a cookie; if the country cookie is not set, the product page redirects to a country-selection page and then back to the calling page. Once a user has selected US or Canada, they are never asked for their country again, and all prices are shown in their currency with shipping costs. It's a nice experience for the user: with the country cookie set, they can browse all the products without being asked again.

The problem is that Googlebot and other robots don't use cookies. The bot hits the product page, the country cookie isn't set, so it redirects to the country-selection page; it goes back to the product page, the cookie still isn't set, and it bounces to the selection page again, over and over. The end result is that Googlebot never sees the product page and it never gets indexed!

Should I consider checking for robots and, if it is a robot, showing the default currency so they do get to see the product page? Unfortunately all pages with product items behave this way. And how do I check for Googlebot using ASP? Any help would be appreciated!

Thank you

[edited by: incrediBILL at 5:57 am (utc) on Sep. 15, 2008]
[edit reason] no specifics, see TOS [/edit]
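The poster asks about classic ASP; purely as an illustration (in Python rather than ASP), a simple user-agent check is just a case-insensitive substring match against a list of known crawler tokens. The token list here is an assumption; extend it for other crawlers:

```python
# Crawler tokens to look for in the User-Agent header.
# This list is an assumption for the example, not exhaustive.
BOT_TOKENS = ("googlebot", "msnbot", "slurp")

def looks_like_bot(user_agent):
    """True if the user-agent string contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)
```

Keep in mind the user-agent is trivially forged; as noted earlier in the thread, a full-trip DNS check is the reliable way to confirm a crawler's identity.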


3:36 am on Sep 15, 2008 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Neither of the examples given here is one where the black-hat definition of "cloaking" applies. Serving different language or currency content based on location is not "cloaking with intent to deceive" either visitors or search engines.

Using the user-agent or reverse-DNS lookup method to force a particular language or currency setting for a search engine robot is not maliciously deceptive.



2:27 am on Sep 24, 2008 (gmt 0)

5+ Year Member

I understand, but here is some simple help for anyone in my position. Instead of using code to detect Googlebot via the user-agent variable, I just set a dummy cookie called testcookie to 1, then checked: if testcookie is 1, redirect to the forced country check; otherwise do nothing, so the page defaults to the USA, the default country. This way I'm not looking for a string like "googlebot" or checking IP addresses that may change over time; I'm checking whether this is a browser client that accepts cookies, and only then redirecting to the country-selection page. Otherwise I just display the page, so Googlebot can index it.

Now to my more pressing point: how do I make Googlebot come quickly :-) and when it comes, it only gets my main page, not the other pages. Is there a way to make it index pages quicker? Sorry, this post probably belongs elsewhere, but a simple solution to my earlier problem of Google skipping my pages will hopefully help someone in the future!
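That cookie-test flow can be sketched framework-neutrally like this; the `render`, `redirect`, and `set_test_cookie` callbacks are hypothetical stand-ins for whatever the page framework actually provides:

```python
def handle_product_page(cookies, render, redirect, set_test_cookie):
    """cookies: dict of cookies the client sent back.
    render(country): serve the product page in that country's currency.
    redirect(url): send the client to another page.
    set_test_cookie(): attach testcookie=1 to the response."""
    if "country" in cookies:
        # Returning visitor: use the country they already chose.
        return render(cookies["country"])
    if cookies.get("testcookie") == "1":
        # The client echoed the test cookie, so it accepts cookies:
        # safe to bounce it through the country-selection page.
        return redirect("/select-country")
    # First hit, or a cookieless client such as Googlebot: plant the
    # test cookie and serve the page in the default (US) currency,
    # so crawlers can index it without entering a redirect loop.
    set_test_cookie()
    return render("US")
```

The design point is that the bot is never named: anything that won't echo a cookie, crawler or not, simply gets the default page instead of the redirect loop.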
