
Cloaking Forum

Now I'm scared!
Do SE's know my pages are coming from a cgi script?

 6:37 pm on Apr 16, 2001 (gmt 0)


I'm using a *simple* cgi script to redirect visitors based on IP address. I'm feeding visitors and SE's pages that live above my document root so I do not worry about spiders stumbling upon them and finding pages I don't want them to see.

Reading this forum I followed a few links off site and found some schmooze about SE's being able to tell that the page I'm feeding them is coming from a cgi script and as a result I can get penalized. The answer to this problem, of course, is to purchase Joe Schmo's superior product.

My question(s) is this. How much truth is there in this statement, and barring cgi, what would be the optimum method for super stealth cloaking?

Thanks much,


The script I use is basically this...

index.cgi is the "home" page.

A visitor comes in and index.cgi checks the IP address via REMOTE_ADDR and compares it to a list of IP addresses associated with spiders.

# \Q...\E keeps the dots literal; ^ anchors the match at the first octet
if ($ENV{'REMOTE_ADDR'} =~ /^\Q$ipaddress\E/)

then a spider flag is set to true and an index.html page is read from a directory above the document root, where no visitor has access, only my script, since it runs under my UID. The script then reads each line of the optimized index.html page and feeds it to the "visitor"/"spider" via

foreach $line (@lines) {
    print $line;
}

Edited by: awoyo



 7:42 pm on Apr 16, 2001 (gmt 0)

Jim, please describe the nature of your *simple* cgi. Are you actually redirecting, or are you delivering? What are the extensions of the pages you are feeding? If it isn't too proprietary, maybe you could post your script here for review?


 7:48 pm on Apr 16, 2001 (gmt 0)

Any SE could request your page from some unknown (to your script) IP and compare what they receive with what their spider saw from a known IP and determine that you are cloaking. The only defense I know of is constant vigilance, intelligence (Search Engine Spider Identification), and keeping your script's IP list as current as possible.

My experience (several years of using discreet cloaking) is that most major SEs have concluded that cloaking is not a leading source of spam and are generally ignoring cloaked pages that are relevant to the human page you are promoting.


 2:12 pm on Apr 17, 2001 (gmt 0)

Thanks for the reply, littleman and DaveAtIFG. I edited my original post with more information that I hope will answer your question. I'm assuming that our reply notification does not work for edits so I'll re-post this. The script is a fairly well known, very basic script, but I don't want to mention the name of it because of forum restrictions.

Thanks again,



 4:47 pm on Apr 17, 2001 (gmt 0)

You could make your script a little more discriminating and secure by simply adding a check on the user agent, which CGI exposes as HTTP_USER_AGENT.

if ($ENV{'HTTP_USER_AGENT'} =~ /^Mozilla/) {
    # display human page and stop here
}

Then begin checking your IPs.

It'll probably serve pages a little faster and offer less load to the server too since the majority of visits are non-spider hits and you'll avoid searching the IP list for every hit.


 4:58 pm on Apr 17, 2001 (gmt 0)

The mechanics of the script seem fine. Like you said, it is basic cloaking but it will get the job done. The main thing with this setup is to be vigilant about IPs and watch for rogue spiders. A $1k script will give you little more protection.

I do, however, see a potential problem. You are running the index page with the 'cgi' extension. That is likely to get you into some trouble. If you are on an Apache server that allows .htaccess overrides you could put 'AddHandler cgi-script .htm' in your .htaccess file to make it treat .htm files as CGI scripts (AddHandler comes from mod_mime, and the directory also needs ExecCGI enabled). If you are on NT/2000 talk to your admin to see if he will change it for you. If that isn't an option, then you would be better off calling the script via SSI. Though .shtml may send up some flags, it probably is less obvious than .cgi.
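For reference, the .htaccess change described above might look roughly like this. It is only a sketch: it assumes the host's AllowOverride setting permits both Options and FileInfo in this directory, and that every .htm file there is meant to run as a script.

```apache
# Allow CGI execution in this directory (needs AllowOverride Options)
Options +ExecCGI
# Treat .htm files as CGI scripts (needs AllowOverride FileInfo)
AddHandler cgi-script .htm
```

Any plain static .htm file left in the same directory would then also be run as a script and fail, so it may be safer to limit this to the directory holding the cloaked page.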

jeremy goodrich

 4:59 pm on Apr 17, 2001 (gmt 0)

You might want to add "opera" as well for the browsers.

Then again, there are many spiders, e.g. Slurp, Ask Jeeves, and recently the FAST research and development spider, that can come through with Mozilla user agents.

It's all a matter of nailing the sequence of how the spiders come through, or what marks visitors. For example, the HTTP_REFERER variable is normally set only for actual visitors; spiders generally do not send a referrer.

Good luck with this.


 5:09 pm on Apr 17, 2001 (gmt 0)

Wow, we are all over this thread. Jeremy, that is a good point for exclusion, but it won't make much difference for letting in, if that makes sense. I think you have to be careful with exclusion via Mozilla these days: Fast, Ink and others are now spidering with Mozilla UAs.


 5:50 pm on Apr 17, 2001 (gmt 0)

Ohhhh.. That is a nice touch! Thank You!!!
I love this forum!!!


>If you are on an apache server with mod_rewrite you could put
>'AddHandler cgi-script .htm' in your .htaccess file to make it
>treat .htm's as cgi scripts


 6:43 pm on Apr 17, 2001 (gmt 0)

To configure IIS to process .htm files as scripts:

1. Start up Internet Service Manager
2. Select the web site you want to do the processing on
3. Right click on it and select Properties
4. Select Home Directory
5. If necessary, create an application
6. Press Configuration
7. Select .asp and press Edit
8. Copy down the information in the dialog
9. Press Cancel.
10. Press Add
11. Use the file extension .htm, use all the other information that you copied down in step 8
12. Press OK to all the dialogs
13. Restart IIS for good measure

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved