To Cloak or not to Cloak?

Can anybody report success with it?

chiefmonkey

4:24 pm on Apr 22, 2002 (gmt 0)

Back in May 2000 Brett_Tabke announced the end of cloaking on all their sites.

Is this still the case? Has cloaking stepped up the game against search engines?

I'm thinking about getting some software, but before I do I'd like to know that I'm not going to shoot myself in the head and the foot.

Can anybody report success with it?

cheers

volatilegx

4:28 pm on Apr 22, 2002 (gmt 0)

I can tell you this: I don't cloak from my "main" domains. I only use the technology from domains located on different Class C blocks, with different ownership of the domains. However, I still get excellent results from cloaked websites. I believe the rankings and traffic I'm getting have nothing to do with the fact that the optimized pages are cloaked, however. Probably the reason I'm getting good results is good SEO on the optimized pages.

chiefmonkey

5:33 pm on Apr 22, 2002 (gmt 0)

thanks for the reply volatilegx

do you know for sure that the pages you're having good results with are definitely pages that you've created yourself, or are they ones created by some cloaking software that generates keyword optimised pages?

i must admit i'm still dubious about getting some cloaking software since i've just read these FAQ's on google

[google.com...]

is google just trying to scare us ?

Air

12:42 am on Apr 23, 2002 (gmt 0)

>is google just trying to scare us ?

IMO it's like travelling the highway: there is a posted speed limit but not everyone is travelling at the same speed. Some travel faster and some slower, and even as those exceeding the speed limit pass a police radar they don't get pulled over. Does it mean the posted limit is just a suggestion? No, it is the limit and can be enforced, more so if your car flaunts the fact that it is travelling faster than the speed limit. Real loud exhaust, fat tires, known fast speed abusing car, are all factors that get you noticed for closer scrutiny and maybe penalized. If you are exceeding the speed limit by a large margin then it's only a matter of time 'till you get nailed.

It isn't that different with cloaking and SEO in general. Stay under the radar, don't draw attention to what you are doing, don't exceed optimization limits by too great a percentage. Get to know the roads you travel, some tolerate greater deviation from posted limits. Do this and you are less likely to attract closer scrutiny and possibly be penalized.

Cloaking alone will not improve your rankings unless you have some design elements that are hurting your rankings. In that case cloaking can help by serving pages to search engines without those indexing constraints. That is what I am seeing cloaking being used most often for these days.

volatilegx

4:11 pm on Apr 23, 2002 (gmt 0)

>do you know for sure that the pages you're having good results with are definitely pages that you've created yourself, or are they ones created by some cloaking software that generates keyword optimised pages?

I created the pages myself. Actually the pages are generated from templates, which I created. I also wrote the cloaking software. Those pages are NOT on Google, however... I don't usually cloak to Google because I find that the entrance page technique doesn't work so well for Google because it is more the off-site variables that get you a good ranking/pagerank for Google.

I think Air made some good points above. Don't expect cloaking to be a magic bullet for your SEO campaign. It's just a tool, more of a defensive measure to protect your HTML from being stolen by competitors than anything else.

sinyala1

8:50 pm on May 3, 2002 (gmt 0)

Yeah, I agree with Air. Is it possible for a search engine spider to catch a cloaked site?

The answer is no. Not if you're using DNS and IP's only. A human can catch you so if you're going on the radar taking a keyword that gets millions of searches a month then ya you might have something to worry about, but no they can't catch you....a human would have to go from their search engine records and then look at what your site offers and even then you could say we did a site redesign. It all depends on how your cloaked pages look or what's on them.

I may be the only one, but when I find a site where I look at it and see no way in hell this thing could be number 1, and I see it's not a directory listing, I'll spoof my IP to the most known spider IP address and then go look at it on my linux box. That'll tell me right off the bat what it is. If you're using agents to cloak your script you should stop, because agents can be easily fed to a web page and caught easier.

volatilegx

11:25 pm on May 6, 2002 (gmt 0)

sinyala1

Don't be so confident that you can't be caught. You can... some search engines spider from unknown IP addresses that don't resolve through DNS to one of their domains.
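
To make the point concrete, here is a rough sketch of the kind of reverse-DNS check a DNS/IP-based spider list relies on, and of where it goes blind. It is only an illustration: the suffix `.crawler.example-engine.com` and the function names are made up for this example and are not taken from any real engine or cloaking package.

```python
import socket
from typing import Optional, Sequence

# Hypothetical hostname suffix, for illustration only (not a real crawler domain).
ENGINE_SUFFIXES = (".crawler.example-engine.com",)


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the hostname an IP reverse-resolves to, or None if it has none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def classify_by_reverse_dns(
    hostname: Optional[str], suffixes: Sequence[str] = ENGINE_SUFFIXES
) -> str:
    """Classify a visitor from the hostname its IP reverse-resolves to."""
    if hostname is None:
        return "unknown"  # no reverse record at all
    host = hostname.rstrip(".").lower()
    if any(host.endswith(suffix) for suffix in suffixes):
        return "spider"
    return "visitor"
```

A spider coming from an address with no reverse record, or one that resolves to an ordinary ISP, lands in "unknown" or "visitor" and would be served the human page, which is exactly the case described above.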

johnser

12:54 pm on May 7, 2002 (gmt 0)

Volatilegx

Any clues which SE's do this?

Mikkel Svendsen

1:42 pm on May 7, 2002 (gmt 0)

As far as I know IP-spoofing is not only a difficult task to handle but also very illegal in most countries. It is not something I would recommend anyone to play around with.

Yes, search engines can do automatic detection of cloaking. I have been working with search engine technology development for some years now so I know this for sure. The question is, do they do it - and who does it?

The problem is that it takes a lot of resources to do the cloaking check. There is never enough development and computer resources available to handle everything you want - so all search engines must prioritize (like any other business). They must focus on where they get results. So doing a broad cloaking detection on the entire index is not something I believe is about to happen.

Cloaking is not the biggest problem for search engines now - general spam and other issues are far more important to keep a high quality index. As far as I see, most search engines focus on how to combat spam - whether it is cloaked or not. Not cloaking in itself. And I think that is wise.

I bet that the first automated cloaking detection systems in production we will see will be systems that check pages that have been detected for possible spam. The cloaking detection would only run on those pages - as reported or found. That will limit the resources needed to an acceptable level and most likely help clean up the worst cloaking spammers.

So the best advice is to stay under the radar - as stated by others. In the case that you are using cloaking I would say stay WAY under the radar! If you cloak you simply do not want to be detected for spamming - any kind, not even something close to spam. If the search engine system detects you as a spammer, it is likely that a cloak check will run too and bust you.

But then again, that's just what I think :)

sinyala1

1:17 pm on May 11, 2002 (gmt 0)

> Not cloaking in itself. And I think that is wise.

I agree with that statement, but I don't really agree that cloaking can be caught so easily. If you mean resources by humans then yes it's possible, but what resources do you mean, and how can they catch cloaking? If general cloak detection was in fact implemented then DNS/IP would update the cloaked database, leaving the cloaked site still safe. How do you see it being possible to catch cloaking? I'm interested in knowing this.

Mikkel Svendsen

3:53 pm on May 11, 2002 (gmt 0)

I am sorry but I am not allowed to discuss details about how such systems work, as it would violate NDA's I have signed with some clients. Hope you understand :)

But I am allowed to tell you that it is indeed possible to almost fully automate such detection but it takes a lot of computer resources - much more than the present value of a completely cloak free index.

At least that's my opinion based on experience. I can't be sure how every search engine handles this but I am certain that most of them will begin to do automatic cloaking detection (and banning) if the problem becomes too big. Right now it seems like there are so many other factors that can be tweaked much cheaper - and with better results, so I honestly doubt that any engine is doing this across the full index and all updates today.

sinyala1

4:00 pm on May 11, 2002 (gmt 0)

Sorry to say but I don't think that's truthful if you say you can catch all cloaked sites. It's all server side. If you're redirected to a default.htm or html page HOW is it possible for a search engine to find out that it's a cloaked site? By checking its databases against....a human viewer? HOW is it possible to detect this? You request a web page, server side scripts tell the server what to do, you get sent to the page requested. Since this is the index page and you could be sent to an actual index.html as the search engine spider, how is a spider gunna tell it's cloaked? By trying different DNS/IP's? Spider traps would get all this and then be updated.

jeremy goodrich

4:26 pm on May 11, 2002 (gmt 0)

My experience has been the opposite of yours, Mikkel...no engine that I have worked with has had such a system, or the resources to develop one, much less even begin to research how it would be done.

The way I look at any engine trying to catch a cloaker, the massively auto generated, spammy, only for search engine spider pages kind of cloaking, is that you just can't do it in automated fashion (unless you use shadow IP's or spoof) because of the nature of the pages: they aren't magic.

They are regular, normal, HTML pages and in fact, aren't very special. It's just that after having so many pages ripped (there was that one on Excite that everybody had to rank well so long ago....) it gets a little frustrating giving the search engine and the user the same exact code / keyword combos / layout, etc.

In the end, I would say cloaking is a lot like Air said: be careful, but of course, it's always a good idea. Just don't expect it to be a magic bullet, or cure all to get better rankings.

If you don't know how to cloak already, I would say regular optimization with no cloaking needed will get you what you want (with Google, anyway) though I would think about cloaking if you have special HTML you don't want people to rip off.

sinyala1

4:37 pm on May 11, 2002 (gmt 0)

I agree. I don't see the possibility of it being able to catch a cloaked site. This is server side here where the server tells where the requested web site is. I asked 5 programmers (guru's) is it possible and they all said no after looking at my source code. Some came back with ideas but it would still not be possible.

Mikkel Svendsen

4:42 pm on May 11, 2002 (gmt 0)

"not possible" is something lazy programmers say when they don't know what to do or don't want to do it ;)

sinyala1

4:57 pm on May 11, 2002 (gmt 0)

It's logic not laziness. If you can submit one idea in a broad range such as finding out by DNS changing etc. I would gladly consider it, but you don't have any reasoning of how this is possible.

jeremy goodrich

5:04 pm on May 11, 2002 (gmt 0)

(most of this is a bit off topic.....but.....)

If 10,000 servers (google) can't do it...what engine has more computing power?

And, I'm sure you recall the infamous Inktomi debacle, Mikkel. I worked for firms that were on that list. It was compiled by hand and not by spiders and as far as I've seen in the US market, they have been the most 'gung ho' in hunting down massive cloakers.

Aside from that, I'm sure you are aware many, many SEO's have software which builds pages on the fly, from databases of related content, to produce human readable sentences, phrases, and entire sites, complete with links, which seem a little garbled, BUT, couldn't be determined when spidered if it was written by a person or software.

Now, aside from the AI implications of such a system, if they really had some 'intelligent spidering methodologies' then I would think they'd sell that, and get out of the search engine business ( again, that's something, judging by most US corporations, that doesn't look too profitable).

Getting back to topic, to cloak or not to cloak, I for one don't do it for Google anymore. Last time I did it, the company I work for made plenty of money, and the pages never got kicked out (they're still in there, 4 months later) but the ROI isn't nearly as great as a site which is well optimized from the ground up, and has both spider and human content.

The more content you have on a page, the better, for Google, because of the full text indexing that they do...and the cloaked pages I always built didn't have many 'fuzzy keyword combos' which can pull in a lot of bonus traffic.

4eyes

5:09 pm on May 11, 2002 (gmt 0)

Sinyala1,

It is extremely simple for them to automate cloaking detection.

At its simplest level, they can hit the site with 10 spiders in quick succession. One, or more of these spiders comes from a new and previously unused IP.

If the 10 pages returned are not identical, then it's cloaking.
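
As a rough illustration of that comparison step only, here is a minimal sketch that assumes the bodies have already been fetched by the different spiders. The function names are invented for this example and do not describe any engine's actual tooling.

```python
import hashlib
from typing import Iterable


def fingerprint(body: str) -> str:
    """Hash a page body after collapsing whitespace, so trivial
    formatting differences do not count as a mismatch."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def looks_cloaked(bodies: Iterable[str]) -> bool:
    """True if the fetched copies of one URL are not all identical."""
    return len({fingerprint(body) for body in bodies}) > 1
```

An exact-match test like this would also flag honest pages that rotate ads or print the time, so a real check would need something more tolerant than a plain hash comparison.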

(edited by: 4eyes at 5:12 pm (utc) on May 11, 2002)

sinyala1

5:12 pm on May 11, 2002 (gmt 0)

LoL, that's why there are things called spider traps that set off every kind of flag as possible that are submitted weekly in groups of 20. That's one spider trap submitted every 2 days all cloaked.

4eyes

5:22 pm on May 11, 2002 (gmt 0)

Think they don't know that?

If they just choose to hit a shortlist of sites in the database and ignore your regular submissions (like so many do already) your spider traps aren't in the game.

The cloak test spidering can be run independently from normal spidering, and if they deliberately limit it to a small section of the sites in the database, by the time your spider traps are triggered a large number of your other sites could have been decloaked.

But this is still just over simplification - if they want to automate cloaking detection they can do it.

The question should be 'why aren't they already doing it energetically'?

The answer might be 'Because many of their major sponsors sites use cloaking extensively'

lazerzubb

5:27 pm on May 11, 2002 (gmt 0)

If 10,000 servers (google) can't do it...what engine has more computing power?

OFF TOPIC AGAIN.

You think Google have all that computer power? Think again. Sure they have many computers, but they said themselves they had problems with storage, and that doesn't sound good.
I think they focus more on a bigger index than a few sites which are cloaked.
And I agree with Mikkel, I don't think it's that hard for the SE's to find cloaked sites.
And there are so few out there who use it that it isn't THE big problem.
Remember, sometimes the things you think are very hard to do might be very easy.

(edited by: lazerzubb at 5:31 pm (utc) on May 11, 2002)

sinyala1

5:31 pm on May 11, 2002 (gmt 0)

You're talking about a really short list then and they would have to be not in the same building or on the same internet service provider as google's search engine is using because the dns would show up the same. It's not hard to probe the net for DNS from google involving google's se's or their offices. Find it then add it. So you're talking about an entirely new branch (internet connection) of the place trying to find cloak sites.

sinyala1

5:41 pm on May 11, 2002 (gmt 0)

hmm...to my knowledge google is scanning or running beta tests on finding cloaked sites as we speak. They'll find em, but not by usual ways. If they do probe sites from multiple IP's then they would also have to probe sites that are throwing flags or showing up to certain criteria. This might be just the top ranked sites or...? I have no doubt that the search engine company can catch a cloaked site but I don't believe the search engine itself can. It has its own purpose. I also have a good feeling that there is a blacklist for cloaked sites. I purposely exposed a cloaked site and it was banned from all search engines. I do mean all, on the same day.

WebGuerrilla

5:51 pm on May 11, 2002 (gmt 0)

It really isn't that difficult to set up an automated system to catch cloaking. All you need to do is run some spiders from an IP block that no cloakers would have on their list, and then do some comparative matching.
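
The "comparative matching" part could look something like the sketch below. It is a simplified illustration only: it assumes the two copies have already been fetched (one by the regular spider, one from the unlisted IP block), and the 0.5 threshold is an arbitrary value picked for the example, not a known setting of any engine.

```python
from difflib import SequenceMatcher


def similarity(page_a: str, page_b: str) -> float:
    """Word-level similarity between two page bodies, from 0.0 to 1.0."""
    return SequenceMatcher(None, page_a.split(), page_b.split()).ratio()


def probably_cloaked(spider_copy: str, shadow_copy: str,
                     threshold: float = 0.5) -> bool:
    """Flag a URL when the copy served to the known spider differs
    too much from the copy served to the shadow bot."""
    return similarity(spider_copy, shadow_copy) < threshold
```

A fuzzy score rather than an exact match leaves room for pages that change slightly between requests, while two copies that share almost no words still stand out.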

How successful would cloaking be if a major engine was running shadow bots through a block of IP's they leased from AOL? And what if that bot was using actual SERPS as the starting point of its crawls so it was passing keyword referral information to your log files? Is anyone really going to add an AOL Class C to their IP list and begin serving all their cloaked pages to anyone using AOL?

For an engine like Google, it's even easier because they can easily isolate a very small subset of their database to focus their efforts on. GoogleGuy has mentioned that they have determined that 96% of the pages in their index that are using the noarchive tag are doing so to hide spam. Within that subset, you'll find 99% of all cloaked pages. Since the overall percentage of noarchive pages is relatively small in comparison to the size of the complete database, it wouldn't be too difficult of a thing to set up.

However, as Mikkel pointed out, it is just a matter of practicality. In the big picture, cloaking makes up a small percentage of search engine spam. A search engine will get much more bang for their buck by focusing on all the other spam techniques that are out in the open for everyone to see.

Of course, that doesn't mean that these types of tools don't exist, it just means that they are only used to target high-profile keyword niches that are known to have a higher than normal ratio of spam.

sinyala1

5:56 pm on May 11, 2002 (gmt 0)

My logs are automatically added so yeah, an AOL address would be added regardless and then it would be easier to add them to my spider trap as they all use noarchive.

4eyes

5:59 pm on May 11, 2002 (gmt 0)

It would seem foolish for them to use any IP associated directly with them.

Clearly you think they cannot check automatically, and I am unable to shake this belief, but automatic checking is not the main threat.

The main threat is being 'turned-in' by your competitors and triggering a manual inspection. People will 'turn you in' for cloaking just because you are above them and they don't know how you did it.

Cloaking a site may be OK for short term gain, but if I am investing time and money in building link popularity etc, I don't want to risk losing that by using cloaking and getting caught.

Sure, cloak your paid pages on gateway sites - but I think the increasing dominance of Google makes the return much less.

For most sites, the effort is no longer worth the risk.

sinyala1

6:05 pm on May 11, 2002 (gmt 0)

I agree. All you'd need to do is either IP spoof or type in the website, get the pages of the website listed on that engine, then spider the site and see what pages you come up with. If they don't match, well there you have it. It's cloaked.

startup

6:27 pm on May 11, 2002 (gmt 0)

sinyala1,
Are you feeding every spider the same page? If you are, this is very hard to detect. If not, what are you feeding spiders from Exodus?
Cloaking done correctly will not get you banned. The catch is, how do you find out how to do it correctly.

sinyala1

6:45 pm on May 11, 2002 (gmt 0)

Well, I only do it upon client request and tell them they can get banned for this and have them sign a waiver, but yeah, same pages for all engines. Of course different pages are run through different algorithm and cloning techniques but yeah, they're all the same. I disagree there, cloaking done perfectly will get you banned. I've caught people cloaking many times. Ever see a site and wonder how the hell it got there? Check the search engine's database then check the site. I don't report cloaked sites, but if I was someone else they sure could report the found site or even my clients. Cloaking a perfect site (meaning perfectly optimized pages) will probably put you in even more danger.

You find out and get better by experience and knowledge of it. I've got a few thousand domains to do whatever I really want to. Most will never be on search engines nor want to, so I submit them and cloak them. That's how I tested cloaking for a year before I actually did it for a client. When my database got to about 2,000 IP's of only spiders (realize these index pages were in an unknown language, with words like kjdsf#@$#dkjfe43 that no one would ever search for) I knew that it was spiders or web surveyors going off search engine databases and my spider list was solid. I began to get hits in the thousands from certain IP's alone (no search engine IP will hit a site that many times). In fact there are now more and more IP's with no DNS hitting my spider traps. These are cloaked spider traps and I make it very easy to tell they're cloaked, because I make them go off agent names and then throw those hits into a separate category. Meaning hits detected by agent name go into another category of my database of spiders. The rest go into an IP database category. All my main sites that I really cloak for optimization go off DNS and IP's alone. But when you can report yourself to a search engine saying you're cloaking your site and have them inspect it and have them come back and say, "We think this is a legit site" then I guess you can say you're doing cloaking the right way, but I don't know if that's possible. I've only had it happen once, when I reported one of the sites I cloaked and had them come back saying it was legit after they inspected it for cloaking.
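
For anyone trying to picture the two categories, a stripped-down sketch of that kind of bookkeeping might look like this. It is not my actual code: the `Hit` record, the `examplebot` marker and the function name are all invented for the illustration, and the IPs in the usage below come from a documentation-only range.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    ip: str
    user_agent: str


# Hypothetical agent substring, for illustration only.
KNOWN_AGENT_MARKERS = ("examplebot",)


def split_trap_hits(hits: Iterable[Hit]) -> Tuple[Counter, Counter]:
    """Count trap hits per IP, split into two categories:
    hits matched by agent name, and everything else (IP only)."""
    by_agent: Counter = Counter()
    by_ip_only: Counter = Counter()
    for hit in hits:
        agent = hit.user_agent.lower()
        if any(marker in agent for marker in KNOWN_AGENT_MARKERS):
            by_agent[hit.ip] += 1
        else:
            by_ip_only[hit.ip] += 1
    return by_agent, by_ip_only
```

Calling `by_ip_only.most_common(10)` on the second counter would list the addresses that hit the traps most often without announcing themselves by agent name.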

sinyala1

6:59 pm on May 11, 2002 (gmt 0)

Advice on not getting caught by users finding your cloaked site:
The first line of code on your cloaked optimized page should be a spider checker that runs through the database and finds out whether it is a person or a spider seeing this page. Make it server side and make it show a different page that looks legit.
Just like cloaking an index page to show a different site, you can do the same for every optimized page. Have it print a page on the optimized search term that would make sense.

I.e. you click my URL to a cloaked page. You go there and the script prints out one of the pages on the website instead of you seeing the cloaked page. Looks legit now.
