| 7:36 am on May 19, 2004 (gmt 0)|
Surely I can't be the only one seeing this?
| 7:46 am on May 19, 2004 (gmt 0)|
Umm, Google cloaks... IP Delivery, Geo-Targeting, whatever you want to call it. If Google is ranking those despicable sites well it means one of two things:
1. Their algo needs work.
2. Google has a high "opinion" of those nasssty despicable sites.
Cloaking isn't a magic bullet. Work on better content, more content, more links and better links. Or worry about your competitors cloaking their sites. One route has an ROI, the other...
| 8:00 am on May 19, 2004 (gmt 0)|
six weeks back I had reported a cloaked site to GoogleGuy. There was no response from him, or Google, and the site is still there.
we have seen in the past that Google mostly implements changes at the algo level, and possibly this is what they are trying to do; it is just taking them some time to achieve it.
another answer is that they are so busy with the IPO, they aren't bothered about this anymore.
| 8:04 am on May 19, 2004 (gmt 0)|
I think that it is more of a problem in Yahoo.
One thing that I don't understand is how cloaked pages can rank high with so few incoming links.
| 8:05 am on May 19, 2004 (gmt 0)|
How are you determining the number of inbound links?
| 8:51 am on May 19, 2004 (gmt 0)|
There is only one way in which Google could clear the cloaked rubbish, and it strikes me as odd that they haven't done it. For 1 week, or 1 month, rename Googlebot to Mozilla/4.0, then pass those results through their spam-mincing machines. Then carry out the 'threat' of banning sites that use cloaking. One small action, quality of search results improved.
Does this make sense, to anyone at Google?
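For anyone who wants to try the "rename the bot" idea on a single URL, here is a minimal sketch. It assumes the site cloaks on the User-Agent header alone (IP-based cloaking would not be caught), and the two user-agent strings and the function names are illustrative, not anything a search engine is known to use. Exact comparison is crude, since dynamic pages can differ between any two requests.

```python
from typing import Callable
from urllib.request import Request, urlopen

# Illustrative user-agent strings; real crawlers and browsers send longer ones.
CRAWLER_UA = "Googlebot/2.1"
BROWSER_UA = "Mozilla/4.0"

# A fetcher takes (url, user_agent) and returns the response body as text.
Fetcher = Callable[[str, str], str]


def http_fetch(url: str, user_agent: str) -> str:
    """Fetch url while announcing the given User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def serves_different_content(url: str, fetch: Fetcher = http_fetch) -> bool:
    """Return True if the crawler and browser user agents get different bodies."""
    return fetch(url, CRAWLER_UA) != fetch(url, BROWSER_UA)
```

The fetcher is passed in as a parameter so the check can be exercised without touching the network; a result of True only says the two responses differ, not why.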
| 9:20 am on May 19, 2004 (gmt 0)|
One of my sites is in competition with a site that relies on this hideous technique to generate revenue.
Cloaking can only provide the user with something that they really were not expecting and will make them think twice about visiting that URL again. Repeat users are absolute gold if not platinum.
I think that Google will weed the cheats out of the index sometime in the future. Google does pride itself on relevant results.
Think long term and don't do it.
| 9:56 am on May 19, 2004 (gmt 0)|
Pretty much all dynamic websites cloak - ie, different content to different customers.
The problem is that Google needs to determine whether they are cloaking because of a functional requirement or to spam search engines.
The former being OK, the latter being not. Unfortunately, this is not easily detectable by an algo.
| 10:02 am on May 19, 2004 (gmt 0)|
|another answer is that they are so busy with the IPO, they aren't bothered about this anymore. |
very much doubt that the techies working on the algo are the same people preparing the docs for the IPO.
| 10:14 am on May 19, 2004 (gmt 0)|
The basic text content of cloaked and non-cloaked pages should be the same. This can be tested by algo, but it does use CPU power.
It seems to me that all search engines should read a percentage of pages anonymously and perform this cloaking test. The real issue is what do you do when a problem is detected. I don't like the idea of pages being banned automatically by computer algos, but, I suspect, there would be so many pages out there that are cheating, human checking would be impossible.
It's also worth noting that anti-cloaking engines would need to change their IP addresses at least monthly.
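A rough sketch of what such a text comparison might look like, assuming you already have the HTML that was served to the crawler and the HTML served to an anonymous visitor. The word-set overlap (Jaccard) measure and the function names are my own choice for illustration; nothing here is claimed to be how any search engine does it.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_words(html: str) -> set:
    """Return the set of lowercase words in the page's text content."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower()))


def text_similarity(html_a: str, html_b: str) -> float:
    """Jaccard overlap of the two pages' word sets, from 0.0 to 1.0."""
    a, b = visible_words(html_a), visible_words(html_b)
    if not a and not b:
        return 1.0  # two empty pages count as identical
    return len(a & b) / len(a | b)
```

A low score would only flag a page for a closer look. As others point out in this thread, legitimate geotargeting or language switching would also score low, so any cut-off value would have to be picked with care.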
| 10:16 am on May 19, 2004 (gmt 0)|
>>this hideous technique to generate revenue.
No, no, no! There are evil people in this world, but there's no such thing as an evil, hideous technique. They may be using it either to generate revenue or to protect their revenue. Over-generalization can be not only unfair, but a dangerous thing.
>>Cloaking can only provide the user with something that they really were not expecting and will make them think twice about visiting that URL again.
Again, an over-generalization that could be very misleading. If people don't get what they were looking for they won't visit again, but they just may get exactly what they were looking for and expecting, which may not be what search engine crawlers are currently capable of handling with current technology.
>>Think long term and don't do it.
People thinking long term have to pay their mortgages long term, and that can sometimes mean protecting their assets.
There's good and there's bad in everything, and moralizing may have its proper place, but when it comes right down to facing the facts of reality, there's no place for, or accuracy in, judging by broad generalizations.
How about let's stay with the original topic here, which is:
How are cloaked sites achieving high rankings?
| 11:01 am on May 19, 2004 (gmt 0)|
|The basic text content of cloaked and non-cloaked pages should be the same |
What if you geotarget and change the language depending on the IP address?
| 11:04 am on May 19, 2004 (gmt 0)|
|How are cloaked sites achieving high rankings? |
by serving up spam infested filler pages to the bot, and something else to the user, thought you'd have been able to answer that yourself Marcia?
|What if you geotarget and change the language depending on the IP address? |
but googlebots have the same geo-IP address?
| 11:10 am on May 19, 2004 (gmt 0)|
I think the idea was that they'd use different IP addresses. If they simply used an IP address from the same block all the time, then you would just have to cloak on that block.
Anyways, there are many many good reasons to cloak that the search engines are completely fine with .. which is why you don't automatically get booted for cloaking.
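To make the point about a fixed IP block concrete: recognising a visitor from a known range is a one-line membership test, which is why a checking crawler that always came from the same block would be trivial to spot. A minimal sketch, where the block is a reserved documentation range standing in for a hypothetical crawler range:

```python
import ipaddress

# Hypothetical crawler block (a reserved documentation range, not a real crawler's).
KNOWN_CRAWLER_BLOCK = ipaddress.ip_network("192.0.2.0/24")


def in_known_block(client_ip: str) -> bool:
    """Return True if client_ip falls inside the known crawler block."""
    return ipaddress.ip_address(client_ip) in KNOWN_CRAWLER_BLOCK
```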
| 11:27 am on May 19, 2004 (gmt 0)|
>very much doubt that the techies working on the algo are the same people preparing the docs for the IPO.
Shak, I think he means that the executives and therefore the company policy (including the use of technical resource) may currently be geared towards the IPO and shorter term objectives at the moment.
As for cloaking, well, I thought it was considered unethical to present one thing to a spider and something different to a user. Unless web designers know better than the Google algo, of course, which we all do!
I also don't understand why google doesn't send out a cloakbot under a disguised agent and anonymous IP.
| 11:42 am on May 19, 2004 (gmt 0)|
Many webmasters are also aware that Google's Top 10 is full of sites using doorways, hidden links and link farms.
I've been amused by looking at URLs such as [widgets-word1-word2.com...] in the Top 30. And what about [widgets-word1-word2.com...] in the Top 50.
John Stossel from ABC News would say "Give me a break!"
We'll have to wait and see how Google's competition deals with the above issues ...
| 12:02 pm on May 19, 2004 (gmt 0)|
How about narrowing it down related to the original topic. The title of this discussion:
|How can cloaked sites be ranking well at Google? |
Assuming that off-page factors such as PageRank and keyword-rich anchor text in inbound links heavily influence ranking well with Google, how does that figure in with cloaked pages doing well in the SERPs?
[edited by: Marcia at 12:40 pm (utc) on May 19, 2004]
| 12:34 pm on May 19, 2004 (gmt 0)|
Marcia, the original post didn't merely ask how cloaked sites are ranking well at Google; it included the statement:
|My clients are begging me to implement cloaking scripts on their sites aswell as that is the only way I can think of to fight it. I don't want to, I think spam cloaking is dispicable but Google aren't leaving me much option. |
So discussions of whether one should or shouldn't cloak (and whether Google is likely to do anything about it) certainly aren't off-topic.
| 12:40 pm on May 19, 2004 (gmt 0)|
Point taken Marcia! I only had time to generalise.
Do you use cloaking to improve your rankings in search engines?
| 1:00 pm on May 19, 2004 (gmt 0)|
Ah, this brings back memories [webmasterworld.com].
whiterabbit, the days when putting keywordX into your page hundreds of times to get higher rankings than 'normal' content about keywordX are long gone.
internetheaven, it's really very simple. The types of pages you're talking about are made by people who are paying attention to search engines. Those people also attempt to tune their titles, content and anchor text for phrases. There are very many pages in Google, and for any given phrase somebody has to come top.
You wonder if you can come higher for the most competitive keywords your clients wish to rank for, simply by delivering different content to Google from the content you want visitors to see? Well, I suggest you make that page that would get you to the top, then look at your human friendly page and see which aspects you forgot first time round.
Is cloaking useful? Yes. In some cases it is obnoxious (the IP delivery used in some local Google domains has been an annoyance to users), but it makes sense to cloak pages for robots if you wish to save them from session ids and other temporary URLs, etc. As for hiding content, if it's relevant why bother to hide it?
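As an illustration of the session id case, here is a minimal sketch of the kind of URL clean-up a site might apply to links it shows to robots. The parameter names in SESSION_PARAMS are assumptions; use whatever your own application actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to match your own application.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}


def strip_session_params(url: str) -> str:
    """Return url with session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

The page content stays the same for robots and visitors; only the temporary part of the URL is dropped, so the crawler sees one stable address per page.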
| 1:15 pm on May 19, 2004 (gmt 0)|
| 2:27 pm on May 19, 2004 (gmt 0)|
|the days when putting keywordX into your page hundreds of times to get higher rankings than 'normal' content about keywordX are long gone. |
The days of scraping serps, from 3 or 4 se's, jiggling them, then serving that page to the bot and another to the user are here for a while ciml.
When I say spam, i don't mean 1990's style spam, I mean 2004 style spam.
| 8:36 pm on May 19, 2004 (gmt 0)|
|How about let's stay with the original topic here, which is: How are cloaked sites achieving high rankings? |
Thanks Marcia, I think we all know that Google can't just ban cloaked sites across the board, there are too many variables on cloaking and completely legitimate uses for it - some which I use on several of my own sites.
I wasn't pointing out that cloaked sites are in the index, I was stating my amazement at how they are getting high rankings.
|by serving up spam infested filler pages to the bot, and something else to the user, thought you'd have been able to answer that yourself Marcia? |
Well that can't possibly be true. Regular spam 'infested' pages are not ranked highly. Keyword stuffing just doesn't work anymore.
|Many webmasters are also aware that Google's Top 10 is full of sites using doorways, hidden links and link farms. |
I disagree, I think they've done very well in filtering through hidden links and link farms. I think they give too much emphasis on internal linking though as sites that build 1,000 pages just pointing internally anchored to one Affiliate Network Spam page seem to be ranking highly right now.
|internetheaven, it's really very simple. The types of pages you're talking about are made by people who are paying attention to search engines. Those people also attempt to tune their titles, content and anchor text for phrases. |
I realise that your statement was made well down the line in the discussion so I'll repeat my original message (though in future I recommend you read through a thread before posting):
"how come cloaked sites (the pages which are fed to the crawlers) have poor inbound links, low quality content, almost non-existent internal linking structure and yet they rank at the top? In my opinion, the pages that the cloaks feed to crawlers shouldn't rank highly even if they WERE the actual pages users were seeing!"
So your derogatory statement shouldn't have been aimed at me. The pages (like I said, the top 20-30 in some instances) are poorly optimized, hardly ranked to, not internally themed or anchored either, and yet they are ranking high. The only factor is the cloaking; for some reason it feels like Google is giving extra marks for cloaking!
| 8:56 pm on May 19, 2004 (gmt 0)|
|The only factor is the cloaking, for some reason it feels like Google is giving extra marks for cloaking! |
It may feel like this, but it is highly unlikely. In fact, it is sufficiently unlikely that it can be disregarded as an explanation for the SERPS you're seeing. If you look hard enough, you'll quickly find a more likely explanation.
| 9:24 pm on May 19, 2004 (gmt 0)|
"Don't employ cloaking or sneaky redirects."
pretty straightforward, I thought.
As for the competition doing it?
Do you want a time bomb sitting in your income stream? At any time Google can decide to remove these sites. Where will you be then? Hopefully at the top with us.
For now, look at the cloaked content... see why it is performing so well... and implement similar content style in a user-friendly manner.
| 10:19 pm on May 19, 2004 (gmt 0)|
|For now, look at the cloaked content... see why it is performing so well... and implement similar content style in a user-friendly manner. |
I'll say it again, I'm looking at the content and that is why I am so amazed. It is incredibly lacking in good content, stemming, on topic material, inbound/internal links, anchor text etc. etc. etc. at best it can be described as a four-year-old-spam-style-doorway-page.
That's what I mean: I'm looking at the content/optimization and it's terrible. (Before anyone says it, I'm a good SEO, so I know when I see basics missing.)
| 10:50 pm on May 19, 2004 (gmt 0)|
Fair enough. I guess it's just the poor performance of the algo then. Not because of letting cloaking through, but for ranking the cloaked content well.
| 11:41 pm on May 19, 2004 (gmt 0)|
internetheaven, I share your confusion. Today alone, I found 18 sites which had absolutely no business on the 1st page of the SERPS. The majority were cloaking ... and not in a way which could possibly be deemed acceptable by a jury of our peers. They were blatantly cheating and no explanation could excuse their abuses.
Next to no content - links with keywords like "Directory", "Catalogue", "Links", "Products", etc. which went absolutely nowhere! Very few inbound links which were from anything other than their own sites ... yet there they are, right up there with several other well deserving fraudulent directories!
To whomever said that the days of keyword stuffing are long gone ... I beg to differ. Today (at least), it is working really, really well!
I found a cloaked site today (#1 of course) which delivered entirely different content to the search engine than that which the user sees. The text the search engine was fed was totally keyword stuffed! Heck, the webmaster didn't even bother trying to mimic the page the user sees. He just put up a text file using lots of H1 tags and a string of keywords followed by a couple of nonsensical paragraphs followed by another string of keywords.
I found another site (also #1) for a relatively competitive search term which was not cloaked and read like:
If you plan to take a keyword keyword keyword keyword vacation in the keyword keyword keyword, our keyword keyword keyword keyword will be ideal! Consider using keyword keyword keyword keyword ... and so on.
AND, I'm talking about the very same keywords being repeated over and over and over again! If I owned that site I would be thoroughly embarrassed.
This particular site had been banned or penalized for the better part of the past 6 months following the Florida update. All of a sudden, it and many others have made a reappearance.
I am not going to panic just yet. I am praying that this is just a minor and temporary glitch which will be fixed soon! Perhaps I'll take a couple of days off and wait for the dust to settle. :)
| 11:50 pm on May 19, 2004 (gmt 0)|
okay, forget theory and algos for a second.
here's what happened in real life for "xwidgetx.com" as the widgetized example. whois data shows the domain was registered jan/2004. they came on like gangbusters last week, so maybe they just got out of the sandbox. in the meantime they were busily feeding pages to the engines.
for the search "xwidgetx" google shows *27,900* results from subdomains like consulting.xwidgetx.com, cars.xwidgetx.com, you get the idea. each page is stuffed with words and phrases scraped from other sites. it appears as a side effect, that if enough of your page gets included, you drop like a stone.
so, going over to yahoo, msn, altavista, gigablast to do the "xwidgetx" search yields no more than *15* results on each engine. most do not show any pages as results directly from *.xwidgetx.com, but are mentions of xwidgetx on pages at other sites. most of the engines yielded no more than *10* results, *15* was the max. all sites returned at least *some* results for "xwidgetx". obviously the other majors have the situation under control for xwidgetx.com, as they know about them, but are severely limiting the results.
a spam report got the usual "yada, yada, yada ... the quality of the search results is important to us"
a workable, albeit unpatentable algo for spam inclusion goes something like this:
a/ someone files spam report about xwidgetx.com
b/ trained monkey looks at xwidgetx.com manually
c/ trained monkey inserts xwidgetx.com into spam db
d/ indexing proc runs algo: if (select spamflag from spamdb where domain=thisdomain) then drop() else keep().
anyone here not able to understand the algo?
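for what it's worth, steps c/ and d/ fit in a few lines. this is only a sketch using an in-memory SQLite table: the table and column names follow the pseudocode above, the function names are made up, and treating a subdomain as flagged when its parent domain is flagged is my own assumption (prompted by the cars.xwidgetx.com style subdomains).

```python
import sqlite3


def build_spam_db(flagged_domains):
    """Step c/: create the spam table and insert the manually flagged domains."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spamdb (domain TEXT PRIMARY KEY, spamflag INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO spamdb (domain, spamflag) VALUES (?, 1)",
        [(domain.lower(),) for domain in flagged_domains],
    )
    return conn


def is_flagged(conn, hostname):
    """Return True if hostname or one of its parent domains is flagged."""
    labels = hostname.lower().split(".")
    # Check the full hostname, then each parent, stopping before the bare TLD.
    for start in range(max(1, len(labels) - 1)):
        candidate = ".".join(labels[start:])
        row = conn.execute(
            "SELECT spamflag FROM spamdb WHERE domain = ?", (candidate,)
        ).fetchone()
        if row and row[0]:
            return True
    return False


def filter_index(conn, hostnames):
    """Step d/: keep only the hostnames that are not flagged as spam."""
    return [host for host in hostnames if not is_flagged(conn, host)]
```

usage would be something like filter_index(build_spam_db(["xwidgetx.com"]), hosts), which drops xwidgetx.com and all of its subdomains from the list and keeps everything else.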
netcraft is reporting 45+ million active domains; surely a database this size is within reach of google. imdb is around three times that right now in just the movie titles they index.
might even create a few jobs outta this for non-PhDs :)