
Home / Forums Index / Marketing and Biz Dev / Cloaking

Cloaking Forum

Best way to recognize googlebot in order to cloak (for redirection)?

5+ Year Member

Msg#: 3695891 posted 12:45 am on Jul 11, 2008 (gmt 0)

Before you start saying "this is not allowed" and the like, I will tell you that the kind of cloaking I need is done by massive websites (I'm talking about international, very well-known multi-million companies). I can't post the names here as it's not allowed.
Having websites in different languages for different countries, they redirect automatically depending on the country you connect from (using your IP address). However, they cloak their pages in order NOT to redirect if the visitor is a search engine or any other bot. You can test this easily by changing your user-agent to a search engine's one: you will see that you are not redirected.
I can't say here which websites I am talking about, as mentioning them is not allowed, but if you spend some time researching you will find some yourself.

So back to the question. I have only heard of BrowserHawk as a tool for detecting bots. Do you think that this is the best choice? What is used by the leaders? Is there a way I can test whether they are using it?
I would like to do the same as them, as we are in the same business and I need at least to compete at the same level.
If not BrowserHawk, can you suggest how to do it? Any open-source alternatives? Any custom code?
Any help would be much appreciated.



WebmasterWorld Administrator incredibill, WebmasterWorld Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month

Msg#: 3695891 posted 5:13 am on Jul 11, 2008 (gmt 0)

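Most major search engines can easily be detected using full trip DNS: a reverse DNS lookup on the visiting IP address, followed by a forward lookup on the resulting hostname to confirm that it resolves back to the same IP.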

Some useful info here:

Hope that helps!
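
For readers who want to see what full trip DNS looks like in practice, here is a minimal Python sketch. It is only an illustration, not the method any particular site uses. The function name `is_verified_crawler`, the injectable `reverse` and `forward` parameters (there so the logic can be exercised without network access), and the suffix list are assumptions made for this example. The suffixes reflect the hostnames Google documents for its crawlers; other engines need their own entries. Note that `socket.gethostbyname_ex` only returns IPv4 addresses.

```python
import socket

# Assumption: hostname suffixes Google documents for its crawlers.
# Other search engines publish their own; extend the tuple as needed.
CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_crawler(ip, suffixes=CRAWLER_SUFFIXES, reverse=None, forward=None):
    """Full trip DNS check: reverse lookup, suffix check, forward confirmation."""
    # Default resolvers use the standard library; tests can inject fakes.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    # Step 1: reverse lookup, IP address -> hostname.
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False

    # Step 2: the hostname must belong to a known crawler domain.
    if not host.endswith(tuple(suffixes)):
        return False

    # Step 3: forward lookup, hostname -> IPs, must include the original IP.
    try:
        return ip in forward(host)
    except OSError:
        return False
```

DNS lookups are slow, so a real site would probably cache the result per IP address rather than resolve on every request.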


5+ Year Member

Msg#: 3695891 posted 11:04 am on Jul 11, 2008 (gmt 0)

Yes thanks, it helped to clarify a lot of things.
My first guess with BrowserHawk was completely wrong :).

Anyway, I also read in other sources and they are suggesting SpiderSpy by fantomas.

I would have to pay $258 every year, but at least it seems they give a complete and updated list of IPs for which I can tweak the page (with good intent, of course).

Would you trust this list?
Do you think Google or the other big search engines would be able to get around this list of IPs automatically if they wanted to test for cloaking?

I am not too worried about manual checks: if the techniques are used with good intent and the reviewer has common sense, the site shouldn't be banned (but yes, it's guesswork to know how tolerant they are). I'm only talking about automatic methods implemented by Google to detect cloaking.


5+ Year Member

Msg#: 3695891 posted 1:55 pm on Jul 11, 2008 (gmt 0)

I did some more research, and the three market leaders in my field all use "cloaking". The thing that is bugging me is that I spent the time to understand all I could, the risks etc., and yet I have seen the market leaders do cloaking using ONLY USER-AGENT strings.
This is quite shocking to me. They present a completely different page to users and to bots, and don't even do a reverse lookup of the IP address.
Do you see any reason why this might be their choice? Do you think that by making it so easy to find out (you don't even have to change your IP) they run less of a risk?
Why are they not paying attention to possible penalties? Maybe having a PR above 6 makes them "untouchable", so they don't have to deal with any of these problems?
That wouldn't be fair.
Do you think I should follow their example?
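
To make concrete what a user-agent-only check amounts to, here is a rough Python sketch. The token list and the function names `looks_like_bot` and `should_geo_redirect` are assumptions chosen for illustration; nothing here is taken from the sites discussed in this thread.

```python
# Assumption: an illustrative list of crawler tokens, not an authoritative one.
BOT_TOKENS = ("googlebot", "slurp", "msnbot")


def looks_like_bot(user_agent):
    """Return True if the User-Agent header contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)


def should_geo_redirect(user_agent):
    """Skip the country redirect for anything that claims to be a crawler."""
    return not looks_like_bot(user_agent)
```

Because the check trusts a header that the client supplies, anyone can pass it by changing their browser's user-agent string, which is exactly why it is so easy to see what these sites serve to bots.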


5+ Year Member

Msg#: 3695891 posted 6:34 pm on Jul 13, 2008 (gmt 0)

Found an alternative solution in the end. We decided to use only Google-approved policies, to be on the safe side.


Msg#: 3695891 posted 7:41 am on Jul 18, 2008 (gmt 0)

So what are Google-approved policies, after all?


5+ Year Member

Msg#: 3695891 posted 2:41 am on Sep 15, 2008 (gmt 0)

I have an issue with my pages <specifics removed>. On the major pages where I list the products I want to market, for example in the left-hand pane, I check for a cookie when a user clicks on a product. If the country cookie is not set, the product page redirects to a page to select a country, which then redirects back to the calling page. This way, once a user has selected US or Canada, the site never asks for their country again, and all prices are shown in their currency with shipping costs. It's a nice experience for the user: they can then browse all the other products, as the country cookie is set and they won't get asked again!

The problem is that Googlebot and other robots don't use cookies. The bot goes to the product page, the check finds that the country cookie is not set, and it is redirected to the select-country page. It goes back to the product page, the cookie is still not set, so it bounces to the select-country page again, and it keeps doing this. The end result is a nice experience for the user, but Googlebot never sees the product page and it never gets indexed!

Should I consider checking for a robot and, if it is one, showing the page in the default currency so that it does get to see the product page? Unfortunately, all pages that have product items do this. And how do I check for Googlebot using ASP? Any help would be appreciated!

Thank you

[edited by: incrediBILL at 5:57 am (utc) on Sep. 15, 2008]
[edit reason] no specifics, see TOS [/edit]


WebmasterWorld Senior Member jdmorgan, WebmasterWorld Top Contributor of All Time, 10+ Year Member

Msg#: 3695891 posted 3:36 am on Sep 15, 2008 (gmt 0)

Neither of the examples given here is one where the black-hat definition of "cloaking" applies. Serving different language or currency content based on location is not "cloaking with intent to deceive" either visitors or search engines.

Using the user-agent or reverse-DNS lookup method to force a particular language or currency setting for a search engine robot is not maliciously deceptive.



5+ Year Member

Msg#: 3695891 posted 2:27 am on Sep 24, 2008 (gmt 0)

I understand, but here is some simple help for anyone in my position. Instead of using code to detect Googlebot through the user-agent variable, I just set a dummy cookie called testcookie to 1. Then I check: if testcookie is 1, redirect to force the country check; otherwise do nothing, so the page defaults to the USA, the default country! This way I'm not looking for a string like "googlebot" or checking IP addresses that may change over time. I am checking that this is a browser client that accepts cookies, and only if it does do I redirect to the country select page; otherwise I do nothing but display the page, so Googlebot can index it.

Now to my more pressing point: how do I make Googlebot come quickly :-)? When it comes, it only gets my main page and not the other pages. Is there a way to make it index pages quicker? Sorry, this post probably belongs elsewhere, but a simple solution to my earlier problem of Google skipping my pages will hopefully help someone in the future!
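
The test-cookie approach described in this post can be sketched roughly as follows, in Python rather than ASP, and with the request handling reduced to a plain function so that it stays framework-neutral. The cookie name `testcookie` comes from the post; the `country` cookie name, the `/select-country` path, the `return` parameter, and the function name are assumptions made up for this illustration.

```python
from urllib.parse import quote

DEFAULT_COUNTRY = "US"  # the post defaults to the USA


def handle_product_request(cookies, path):
    """Decide what to do with a product page request.

    Returns (action, value, cookies_to_set), where action is "render"
    (value is the country to price for) or "redirect" (value is the URL).
    """
    # Always (re)send the probe cookie; only cookie-capable clients return it.
    cookies_to_set = {"testcookie": "1"}

    country = cookies.get("country")
    if country:
        # Returning visitor who has already chosen a country.
        return ("render", country, cookies_to_set)

    if cookies.get("testcookie") == "1":
        # The client proved it accepts cookies, so asking for a country is safe.
        # The select-country path and return parameter are hypothetical.
        target = "/select-country?return=" + quote(path, safe="")
        return ("redirect", target, cookies_to_set)

    # No cookies came back: a crawler or a first request. Show the default page.
    return ("render", DEFAULT_COUNTRY, cookies_to_set)
```

One consequence of this design is that a human visitor's very first request also carries no cookies, so that request renders in the default country and the country prompt appears on the next page view.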

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved