|Cloaking Gone Mainstream|
Languages, agents, doc formats - cloaking is everywhere.
Cloaking has taken on so many new meanings and styles over the last few years that we are left scratching our heads as to what cloaking really means. Getting two people to agree on a definition is nearly impossible with all the agent, language, geo targeting, and device specific page generation going on today. It is so prevalent that it is difficult to find a site in the Alexa top 500 that isn't cloaking in one form or another.
This all came up for us in mid December when, right at the height of the Christmas ecommerce season, a friend's European site was banned or penalized by a search engine. After numerous inquiries, it was learned that the surprising reason for it was cloaking. I got asked to take a look at the site and figure out where there was a problem. The site owner didn't even know what cloaking was, let alone practice it.
I determined that his off-the-shelf server language and browser content delivery program was classifying search engines as text browsers and delivering them a text version of the page. In its default configuration, this five-figure enterprise-level package classified anything that wasn't IE, Opera, or Netscape as a text browser and generated a printer-friendly version of the page that was pure text.
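To make the failure mode concrete, here is a minimal sketch of that kind of default agent classification. This is not the package's actual code - the function and template names are hypothetical - it just illustrates how a whitelist of graphical browsers silently routes every spider to the text page:

```python
# Hypothetical sketch of a naive agent whitelist. Anything not recognized
# as a graphical browser -- including search engine spiders -- falls
# through to the printer-friendly, pure-text template.

GRAPHICAL_BROWSERS = ("MSIE", "Opera", "Netscape")

def choose_template(user_agent: str) -> str:
    """Return which page template the delivery layer would serve."""
    if any(name in user_agent for name in GRAPHICAL_BROWSERS):
        return "full_html"
    # Spiders, Lynx, and anything else unrecognized get the text page.
    return "text_only"
```

With a whitelist like this, Googlebot and Lynx are indistinguishable: both get the text rendition, and the site is "cloaking" without anyone having decided to.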
We explained to the SE just what the situation was, and they agreed and took off the penalty after we said we'd figure out a way around the agent part. Unfortunately, the package had all but compiled in the agent support, and the vendor was surprised when we informed them about it. What was even better was looking around some Fortune 500 companies that run the same software and finding three entire sites that were in effect "cloaked" - they didn't have a clue.
In the end we solved the problem with another piece of software that would exchange the user agent string that the site delivery program was seeing. Yep, we installed cloaking software.
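A sketch of what that workaround amounts to (all names here are hypothetical, not the actual product): a shim in front of the delivery package rewrites spider user-agent strings into a graphical-browser string, so the package's default classification serves spiders the full page.

```python
# Hypothetical agent-rewriting shim. Spider UAs are swapped for a
# browser UA before the delivery package sees them, so its whitelist
# classifies spiders as graphical browsers.

SPIDER_MARKERS = ("googlebot", "slurp", "scooter")
FAKE_BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

def rewrite_agent(user_agent: str) -> str:
    if any(marker in user_agent.lower() for marker in SPIDER_MARKERS):
        return FAKE_BROWSER_UA
    # Real browsers pass through untouched.
    return user_agent
```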
So let's have a little rundown of the current state of cloaking in its various forms:
We've talked a bit about agent based cloaking recently [webmasterworld.com].
Search Engines Endorse Web Services Cloaking:
Cloaking has become just varying shades of gray. We now have instances where search engines themselves endorse cloaking (xml feeds) and in some instances are giving out cloaking software to deliver those xml feeds.
That has resulted in pages intended (cloaked) for one search engine being indexed by another search engine. There have been occasions where this endorsed content has been banned or penalized by another search engine.
Geographic IP Delivery:
Language translations have been a hot topic for the last year. Most major sites now deliver content geographically in one form or another. Hardly a month goes by without someone screaming that they can't get to Google.com because they are transparently redirected to a local TLD. You will also find those same search engines custom tailoring results for that IP address (eg: personalized content generation). You can see the effect yourself by changing your language preferences on a few search engines that offer the feature.
One Browser Web:
The recent history of major browsers is summed up in IE4-6, and Netscape 3-7. There is also a large 2nd tier of browsers: Opera, Lynx, Icab, and Mozilla.
All of these agents support different levels of code and standards. They also have inherent bugs related to page display. If you are a web designer, you could get a degree in the various browser differences of CSS and HTML alone.
Just when we are starting to think in terms of a one-browser web, along comes a whole new set of browsers to consider: set top boxes, cell phones, PDAs, and other mobile devices. These all have varying degrees of support for XML, XHTML, CSS2/3, and the web services protocol blizzard (eg: .net, SOAP... et al).
We've not even begun to talk about IE7, which is rumored to be in final internal beta testing. Then there is Apple's new browser and the growing horde of Mozilla based clones. When you put it in those terms, our one-browser web seems like a distant dream.
Delivering different content to these devices is a mission critical operation on many sites. Generating content for mobile devices is a vastly different proposition than delivering an xml feed to a search engine, or a css tricked out page for a leading edge browser.
Given that the combination of visitor IP and user agent can run into hundreds of possibilities, the only valid response is agent and IP cloaking.
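The dispatch logic behind that response can be sketched in a few lines. This is an illustrative toy, not any particular product: the agent classes, the region lookup, and the page naming scheme are all assumptions standing in for a real GeoIP database and template system.

```python
# Illustrative UA + IP dispatch: content is chosen by combining the
# visitor's agent class with a coarse geographic lookup.

def classify_agent(user_agent: str) -> str:
    ua = user_agent.lower()
    if "googlebot" in ua or "slurp" in ua:
        return "spider"
    if "lynx" in ua:
        return "text"
    if "windows ce" in ua or "palm" in ua:
        return "mobile"
    return "desktop"

def classify_region(ip: str) -> str:
    # Stand-in for a real GeoIP database lookup (the prefix test is a
    # placeholder using documentation-reserved addresses).
    return "eu" if ip.startswith("192.0.2.") else "us"

def select_page(user_agent: str, ip: str) -> str:
    return f"{classify_region(ip)}/{classify_agent(user_agent)}.html"
```

Even this toy already yields eight distinct pages; with real language lists, device tiers, and browser quirks the matrix multiplies quickly, which is the point being made above.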
Off the shelf cloaking goes mainstream.
There are many off-the-shelf packages available today that include cloaking in one form or another. The perplexing part is that many sites are cloaked in ways you wouldn't even know about. Several major forum packages cloak in some form.
I was at a forum this morning that was agent cloaking, and another that was language cloaking. In both cases, the webmasters don't even know that it is taking place - let alone have the tech knowledge to correct it.
Welcome to 2003 - Modern Era Of Search Engines.
This isn't the web of 98-99, where people would routinely get whisked away to some irrelevant site unrelated to their query. Today's search engines are vastly improved, with most engine algorithms putting Q&A tests on every page they include. Those range from directory inclusion requirements, inbound link count and quality, to contextual sensitivity and even a page's reputation.
In this modern era where search engines now routinely talk about their latest off-the-page criteria algo advancements, it's clear that traditional SE cloaking has little effect. It comes down to one simple fact: those that complain about SE cloaking are simply overlooking how search engines work. The search engines have done a fantastic job of cleaning up their results programmatically and by hand.
The most fascinating thing about this new mainstream cloaking is the situation where a site just classifies a search engine as a graphically challenged browser. In that case, cloaking becomes mostly an agent-based proposition. The trouble starts when you throw language delivery into the equation, or even delivering specific content as part of a search engine program.
All of these wide-ranging factors combine to result in something like 10 to the 4th power of page generation possibilities. In that situation, it almost becomes a necessity to put spiders into the all-text browser category and deliver the same page to the SEs that you deliver to cell phones or the Lynx browser.
Thus, we've come full circle on search engine cloaking. We no longer cloak to deliver custom content to search engines, we now cloak for the search engines to keep them from getting at our cloaked content for visitors.
[edited by: Brett_Tabke at 6:15 am (utc) on Feb. 3, 2003]
One should not judge any member by the PR of the site/sites in their profile or their post count.
Judge the member by the quality of his/her posts.
Since he brought up "bank account", I'd be willing to bet rezone makes more $ on SEO/Traffic Management/Site Marketing or whatever you want to call it than anyone who has ever posted here @ WebmasterWorld.
|This thread really isn't about whether it is ethically, or morally "correct" to cloak, it is about the definition of cloaking. |
OK... so, given that (last I checked) it takes years for a 'new word' to make it into any official English Language dictionaries, and (last I heard) there isn't a single person in this thread on the appropriate committees with either Websters or Oxford, it would seem most appropriate to allow the people engaged in an activity to define it.
The new computer/internet terms making it into the official lexicons now were created and defined by computer users, not modern-day Luddites. Hundreds of years ago the terminology involved in brewing was created and defined by brewers, not teetotalers. By that long-standing tradition, it would seem that the cloakers ought to be the ones to decide what the word meant...
And in this thread it seems the cloakers are all pretty well in agreement.
I tend to agree with where Jill and Alan are coming from.
From the search engine's point of view, when they say "don't cloak", they are really saying "don't make us read content that you don't show to humans... just like invisible text. That would be an attempt to trick our algo." When MSN-UK serves different content, it is really GeoIP based; my point on this is further described below.
I think the confusion is arising as everyone does not seem to agree whether cloaking is a 'Technology' (in this case - technical ability to serve different content to different people/UA/IP) or 'Application' (what this technology is used to achieve)
If Cloaking is viewed as 'Technology' (ability) then it is not a bad thing in itself. I think the majority of resentment towards Jill and Alan is coming from people with this school of thought, who view cloaking as a technology. I think Brett framed his opening discussion in this light.
If Cloaking is viewed as 'Application' then we have 2 scenarios here -
1. Use it to do GeoIP, UA based content delivery to Humans - I think there are no contentions on this. This is fine.
2. Use it to do GeoIP, UA based content delivery to SE Bots - I think this is the controversial part. I personally believe that this case is used to serve spammed content to SEs - content that site owners do not want to show to humans - just like invisible white-on-white text.
The issue is what you think 'Cloaking' is, and it really depends on which side of the fence you are talking from and whether it is 'Tech or App'. SEs sure frown if you use it to sniff their bots. Webmasters find it a great way to serve regional or UA based content to their users. Some SEO experts abuse it to get high ranks.
I think Alan and Jill seem to have passed the verdict that Cloaking is defined as ‘Application, #2’ of my description above.
Is there a reliable source or authority, which has defined ‘Cloaking’ somewhere yet? Is it ‘Technology’ or ‘Application’; if Application, then #1 kind or #2 kind? I guess getting hold of an authoritative ‘definition’ would clear some clouds. Google’s definition would surely be from their point of view.
This thread also seems to have a powerful, emotional subtext -- how various people relate to "rules" and "authority". And the issue really is about flexible and evolving rules, not absolutes. These particular rules are set by search engines as a de facto authority, mainly by the strength of their influence.
1. One kind of emotional approach is "Tell me what your rules are, and I'll obey them."
I don't believe any adult who claims they operate under approach #1 -- it's the mode of a child.
2. Another approach is "Tell me what your rules are, so I can figure out what I can get away with."
I don't trust people when I discover that they operate under approach #2 -- it's the style of a rebellious and still self-focused adolescent.
No one has a right to expect me to operate in either of those first two modes. There's a third, more responsible, non-victimized and non-victimizing position:
3. "OK, you've got the power and you're laying down rules? Tell me what they are -- I will make an informed judgement about what I choose to do."
It's crazy to think that all forms of Search Engine Persuasion will ever vanish. I'm certainly not going to be passive about where my sites show up. And it's similarly crazy, IMO, to turn the arena of the rules du jour into a battleground with the labels of right and wrong.
When we allow that to happen, then we open a door for some people to exploit others with their brick-bats of morality. There are always wolves looking to turn others into sheep, and this is one of their tools.
Have you ever noticed this in life? The very people who speak the loudest about right and wrong often have a closet that they don't want anyone to open up. The moral high ground attitude doesn't fly with me -- I've seen too much.
As I said, I don't believe any adult really operates under Approach #1.
SeoRank, I think you're understanding exactly where Alan and I are coming from. (Thank goodness, I was starting to think we spoke a whole different language!)
And Tedster, I don't think this thread or Alan's article is at all talking about what is right or wrong, or moral or not. (You may or may not have noticed that he didn't once mention the word spam in it.) It's simply a way to help people understand that there are many things that some people call cloaking that are actually perfectly fine to do, that won't get you banned or penalized, and have nothing to do with SEO.
Brett talked about all the mainstream applications in his first post here. Those are all wonderful things that can help Webmasters make their sites be the best they can be. But by calling them cloaking, it unfortunately muddies the waters and makes people then think that cloaking in general is okay. The things Brett talks about are fine by the search engines, but true cloaking (as Alan has defined it) is not fine because it's an attempt to trick the search engines. And I'm not saying that it's not fine by me, because I couldn't care less if anyone cloaks. I'm talking about it from the engines' point of view. Most of them don't believe it's okay because true cloaking is an attempt to subvert their ranking algorithms.
Now of course, this whole argument would be a lot easier to swallow if the silly engines would all agree upon what cloaking is and isn't also. I don't think we will get anywhere with this definition because they don't/won't. It seems they feel a bit guilty because some of them are happy to take payment to allow cloaking-like activities, so they don't seem to want to say right out straight, "don't cloak." Google does take the lead on this, which is why they happen to have the best results, imo.
I pretty much agree with what you have said. And don't you think (I am absolutely not passing moral judgment here) that we must own up to the fact that our primary reason for this whole SEO thing is profit? That each and every one of us is trying to one-up the other to gain a personal advantage?
It seems to me we are all here to make a living. I know in my original naivety here at WW I had the impression there was some kind of greater good I had to adhere to because of the collective nature of the web and its analogy to the collective of all people. Forget that. This is commerce plain and simple. We will test every 'rule' to see just how far we can stretch it to meet our needs.
I always find it humorous when someone that has never cloaked a site tries to define it. This thread has been hilarious.
Jill, all you ended up doing was agreeing with everyone that said "doesn't matter what you think, it is the engine's definition that we have to live with."
>>but true cloaking (as Alan has defined it) is not fine because it's an attempt to trick the search engines
What? I don't think there are many folks out there that cloak to "deceive" the engines, what's the point? Clients that hire someone to cloak their site aren't doing it so they can deceive the engines or deceive the surfer. 99% of them want to cloak a site to overcome design limitations.
If someone hands me a multiple framed, heavily scripted site and they want better ranking they don't want me to send the surfers to pages about apples while I send the bots to pages about oranges. They want me to strip all the junk code out of the page the spider visits. Guess what? The text on both pages will be the same. Deceptive? Hardly. Effective? Yes. Do the engines think it is okay? No. They call that cloaking.
>>Google does take the lead on this, which is why they happen to have the best results
Again, What? Google has the best results because they say "don't cloak"? What does that statement have to do with the quality of the results? All the engines could add that little snippet of text tomorrow and a year later the quality of their SERPS would still be the same, unless those engines change the quality of their algorithm.
I think you're still operating under the assumption that people cloak a site to target cotton candy for the bots then feed the surfers a page about phentermine. Doesn't happen that way unless the person cloaking the site is a complete dolt.
Cloaking isn't a magic bullet. The optimization has to be as good as or better than that of pages that aren't cloaked. Cloaked pages are typically focused and relevant. If the cloaked page is created by an SEO they are generally optimized quite well, but cloaking just overcomes the SEs' inability to handle certain technologies. Cloaking is also used frequently to overcome poor design considerations. What cloakers aren't doing is feeding pages about entirely different subject matter to the bots and the end-user.
All you've done is make it seem as if UA delivery is fine while IP delivery is taboo. Some poor webmaster out there will now cloak using UA delivery and get bounced because Google doesn't like UA delivery either.
|All you've done is make it seem as if UA delivery is fine while IP delivery is taboo. Some poor webmaster out there will now cloak using UA delivery and get bounced because Google doesn't like UA delivery either. |
They don't? Says who? I haven't ever seen Google say they don't like that. Why would they?
|Jill, all you ended up doing was agreeing with everyone that said "doesn't matter what you think, it is the engine's definition that we have to live with." |
And what's wrong with agreeing on that? I'm not here to disagree with anyone. What the search engines say IS what it's all about. In fact, I'm hoping to ask the search engines themselves to see what they will say. If they don't agree, then I will happily go along with their definition. The engines are what matters. All I want to do is get a definition. Even if it's not the one I (and Alan) started with.
All of us can argue about the definition until we're blue in the face, but let's see what the engines say when faced with the question. If I give them a definition and say, do you agree, yes or no, then we will have our answer. I am not certain what they'll say, but I'm interested in hearing it.
>>They don't? Says who
The people that were getting bounced out of Google's index seem to think Google doesn't like UA delivery. The fact that people utilizing UA delivery were getting bounced is circumstantial evidence that Google doesn't like UA delivery.
The reason people combine UA delivery with IP delivery is to make the delivery process harder to detect.
>>And what's wrong with agreeing on that
Well, nothing, except you started by creating your own definition of what cloaking is and you still seem unclear about why the engines might have a problem with UA delivery.
We've been over the issues about the engines not having the resources to check intent and Google had indeed removed sites from the index that were using UA delivery.
>what engines say
Just two remarks:
- engines are very important players these days in the web business. Nevertheless, they are just one of the involved parties. To voluntarily hand over all power of definition to them is not necessary, and somewhat naive.
Search engines, per definition, have the job of indexing the web and making it searchable. It's not their job to define how the web should be built and run.
In fact, search engines have always followed the evolution of the web. New techniques and technologies pop up - search engines eventually follow, taking them into account.
Google and Fast are currently those engines at least trying to catch up.
- What search engines say doesn't matter so much as what search engines do.
It's very obvious to every informed webpro that, especially in Google's case, a lot of what they say is rhetoric. Quite understandable, of course. It's a means to an end.
So ya, it is important what the engines say, but it doesn't relieve anybody from thinking and making their own choices.
|We've been over the issues about the engines not having the resources to check intent and Google had indeed removed sites from the index that were using UA delivery. |
I don't pretend to know all the technical differences between UA delivery and IP cloaking. However, is it fair to say that Google boots the UA delivery pages mistakenly? Can UA delivery be used to "spam" them, I guess is what I'm asking.
Rather than say that UA delivery is cloaking, if these pages are getting mistakenly booted, you are correct that people should be warned about using these techniques. Not because they are spam (and possibly not even cloaking) but because they can be mistakenly booted. Is there any way to prevent those pages from being mistakenly booted? Or do we simply have to wait for the search engines to become more sophisticated.
Because you're right, I wouldn't want people to think it's fine to use those methods, and that the search engines are fine with it, if that's not the case. But the engines don't think it's "spam," right? It's just getting caught up in spam "filters" because the methods are similar to "spam"?
I'm learning more about this as we go along, so forgive me if my questions sound simplistic to you.
Hmmm.. I fail to see the grey area here:
Cloaking: Delivering different content to SEs than to users.
SEs dislike cloaking and certainly cannot distinguish between 'white' cloaking and 'black' cloaking. It needs a hand check, and there just aren't the resources for that.
Where's the confusion? - If you serve different pages to bots you run a risk.
Black and white - Easy peasy - No issue
|Cloaking: Delivering different content to SE's than Users. |
Agreed. So read the first post in this thread again. Most of what is discussed there has nothing to do with search engines. So it cannot be cloaking, IMO.
Cloaking involves deliberately hiding from SEs the content that users see. This means the SEs cannot factor that content into their ranking algorithms. It should be clear why an engine with any respect for its algorithms would have a problem with this, no matter how closely the content that is delivered is related to the content that is cloaked. That is why some engines say "Don't cloak".
Some quotes from the SE webmaster and spam pages ...
Google Quality Guidelines [google.com]: Don't employ cloaking or sneaky redirects.
Google FAQ : "What is cloaking?" [google.com] : The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they'll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings.
Inktomi Content Guidelines [inktomi.com] : [Unwanted Pages include] Pages that give the search engine a different page than the public sees (cloaking)
Inktomi Spam FAQ [inktomi.com] : Q: Is cloaking permitted?
Inktomi Spam FAQ [inktomi.com] : Q: What if I present one page to Internet Explorer users, and a different page to Netscape users?
A: That's fine. If the purpose is to serve alternate pages to different human users, based on locality, browser, machine type etc., we do not consider that cloaking.
>>Can UA delivery be used to "spam" them, I guess is what I'm asking
UA delivery can and often is used to cloak a site and Google considers this a violation of their TOS.
>>if these pages are getting mistakenly booted
It's not that they are being mistaken, it is that Google wasn't making the distinction between UA delivery and what Google sees as any other form of cloaking. If you use UA delivery Google can and often has deemed that a violation.
>>Can UA delivery be used to "spam" them, I guess is what I'm asking
Yes. UA delivery can be used to spam them, serve different content, etc,
just not as effectively as a combination of UA and IP delivery.
The majority of people getting booted for UA delivery are getting tossed due to hand checks. Here's where the resource factor comes into play. If a person points out that a different UA string serves up different content for, say, just the index page, and a hand check confirms this, those Google reps aren't going to scour the entire site to determine intent or to see how many pages are using UA delivery. They're going to flag it and/or dump it.
Here's the crux of the matter:
Can UA delivery be used to "spam" them, I guess is what I'm asking
The answer is yes and I think everyone is in agreement that the engines simply don't have the resources to check for the intent of the UA delivery on millions of sites.
So if you want to steer clear of any risk of getting a site tossed for what most of us call cloaking, then stay away from UA and IP delivery. Google didn't make a distinction between UA and IP delivery in regard to cloaking. Read into that what you will.
We're also left with what the engines say versus what they actually mean. I don't look for any engine to clear things up for SEOs; obfuscation better serves their purpose, or at least the people running the show seem to think so. ;)
|Google didn't make a distinction between UA and IP delivery in regard to cloaking. Read into that what you will. |
FWIW, I read into it that neither technology is cloaking, but that either or both technologies can be used for cloaking. Google cares whether you cloak or not. That's what they wrote about.
|We're also left with what the engines say versus what they actually mean. |
I thought those links I posted above were pretty clear. :)
|FWIW, I read into it that neither technology is cloaking, but that either or both technologies can be used for cloaking. Google cares whether you cloak or not. That's what they wrote about. |
Meaning that many forms of UA would not be flagged, nor booted? (At least not automatically?)
You can do anything you want with your site. Google, Altavista, Inktomi, Alltheweb - they are about indexing the web.
'Cloaking' is about presenting your stuff how you want, to whom you want. If you are interested in getting relevant traffic, odds are, you might look into doing anything you could to acquire as much targetted, relevant traffic as possible.
If you have the kind of stuff all the people that are using Google, Altavista, MSN, AOL, Alltheweb, Lycos, etc are trying to find, it only makes sense to try to come up 1st. So, that way - you get people that are interested in what you have to say to your site.
Everybody wins. You do, cause you got people to your site - which is what you were trying to do.
The surfer wins, because they found *exactly* what they were searching for.
The search engine wins, because the user will remember that they found something useful when searching, and more than likely, will come back to search for something else. That is what a search engine wants -> people to use it, and use it again to find stuff.
What does it matter if the way I got my site indexed, or the format of my site, is dependent on what person or program is looking at the site?
A question of, "if a tree falls in the forest, and nobody hears it, does it still make noise"?
We can discuss that all we want, but in the end, as long as the 3 parties involved get what they want out of the relationship they all have -> everybody, most importantly, the internet wins.
I was reading an article the other day that said, internet time was coming at the expense of television time.
To a guy who doesn't watch much TV, but spends a LOT of time on the internet, that is huge. And you know what?
I believe it's all of us making that happen - the search engines, indexing the stuff we as webmasters, publishers, and SEOs make - and the user, going to the search engine, and finding the stuff that we made.
It's a lucrative proposition, and if we are all only trying to do what we want - everybody wins.
Who cares how we arrive there? The important thing is to get there - let's not debate whether it makes sense to program one way or another, use one browser over another, or serve one type of page or another.
It's not about that. If the user is happy with my cloaked, IP and User agent delivered site, and it sits at the top of Google for every keyword in the book, and people *love* my site, nobody, even Google, and especially not the user, will care how my site got to the top.
|It's not about that. If the user is happy with my cloaked, IP and User agent delivered site, and it sits at the top of Google for every keyword in the book, and people *love* my site, nobody, even Google, and especially not the user, will care how my site got to the top. |
That's all well and good, but it matters because people want to know what's acceptable to the search engines and what's not. At least the people that I come into virtual contact with every day.
None of the search engines have ever "spelled" out in "black and white" what is 100% allowable, and what is 100% not allowed.
They have always left a huge "gray" area, where we as individuals are left to interpret which side of the line our pages are on.
Guess you are out of a job... :) Kidding aside, it's nowhere as difficult today as it was during the "Gold Rush" days back in 98-99.
Google is the only "completely" free spidering engine. Everything else is "Pay For Inclusion".
Google has said they don't like cloaking. I don't know why this is confusing to you? Alan pointed out the pages and FAQ. (Though they give a bit different definition than Alan does :)
So, tell your users not to cloak, and they might become visible after a dozen cycles, changing religion a couple of times, and begging for a year's worth of links... :)
|So, tell your users not to cloak,... |
But dontcha see? That's what the problem is. I want to tell them not to cloak... but they (and I, and apparently the engines?) are not exactly sure what it means.
I'm not gonna tell them not to do geo-targeting...as that's not cloaking. I'm not gonna tell them NOT to show IE one thing and Netscape another. And, I'm not gonna tell them not to use trusted feed. But many of you insist that stuff all falls under "cloaking."
So we've come full circle and we're back at square one!
>In other words, the webserver is programmed to return different content
>to Google than it returns to regular users,
So all the Flash sites have to do to cloak is:
A) Language delivery. Just detect the .coms and connections from .coms as English.
B) Agent delivery. Just give them the text version of the page - which is what they are: text browsers.
Do those two things, and you are good to go with cloaking. English visitors using Lynx will get the same page, and you get to deliver text-based content to search engines for indexing. eg: cloaking in the sunshine.
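The two-step recipe above can be sketched as follows. Hostnames and page names are illustrative only, and a real site would use a reverse-DNS or GeoIP lookup rather than the string tests shown here:

```python
# Sketch of "cloaking in the sunshine" for a Flash site: .com visitors
# are treated as English speakers, and text browsers -- spiders included --
# get the text rendition of the Flash content.

def pick_language(reverse_hostname: str) -> str:
    # Step A: language delivery keyed off the visitor's hostname TLD.
    return "en" if reverse_hostname.endswith(".com") else "other"

def pick_format(user_agent: str) -> str:
    # Step B: agent delivery -- spiders and Lynx both count as text browsers.
    ua = user_agent.lower()
    return "text" if any(a in ua for a in ("lynx", "googlebot", "slurp")) else "flash"

def page_for(reverse_hostname: str, user_agent: str) -> str:
    return f"{pick_language(reverse_hostname)}_{pick_format(user_agent)}.html"
```

Note that an English visitor on Lynx and Googlebot land on the identical page, which is exactly why this form of delivery reads as ordinary accessibility work rather than deception.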
>The "I know cloaked spam when I see it"
>argument doesn't really hold water when it's not blatant.
That's the point: there is so little of this out there now that it isn't even worth arguing about. Taking care of problem results is the search engines' cost of doing business. From everything I've seen in the last few years, the problem has all but taken care of itself. We've not had one single story that I know of in the last two years where people were taken away to some inappropriate site after a search. If it has happened, it hasn't lived long in the search engines. That's as it should be.
The closest we've come is with the GW search and the blogs playing with link pop. Neither of those have anything to do with cloaking and point to some logic problems in search engines.
Now the problem is that people are so mental about the whole topic, that innocent people are being classified as doing something wrong.
There was a site that was dragged through the forums just last week and pointed out as "gaming" the search engines. They don't even have anyone on staff that understands search engines, let alone anyone capable of gaming them. All they were doing was protecting their IP by registering every possible domain and typo surrounding their trademark. They got turned in as 'domain spammers'.
The root of all of these problems is the same today as it was in 1994: the search engine model is fundamentally flawed. They have based their entire service on the repackaging of other people's property. Search engines are nothing more than powerful value-added resellers.
The most appropriate analogy I have been able to find is that of the stock broker. Stock brokers sit in between sellers & buyers, do nothing but hook them up, and take a fee for doing it. Often they offer some sort of value-added package such as day trading interfaces or various stock monitoring services. The same is true of search engines: they sit in between surfers & sites, do nothing but hook them up, and take something out (often advertising or listing fees) for doing it.
They are all forms of cloaking, webwhiz. What is and isn't acceptable will vary from one SE to another. Where exactly is the line that will get your sites removed if you cross it? There is no single answer to that question. Can a large company employ forms of cloaking that would get your site removed if you did the same? Yes they can. The playing field is not even; many times what is and isn't acceptable is related to the size of your advertising budget, and this doesn't apply only to cloaking.
I have come to view the public relations departments of all the SEs as "spam departments" or spin departments.
|Yes. UA delivery can be used to spam them, serve different content, etc, just not as effectively as a combination of UA and IP delivery. |
I'm a little confused here DG. Why would a combination be more effective? - Are you saying a combination is more effective than just IP delivery?
>>Why would a combination be more effective?
It is extremely easy to spoof a UA, spoofing an IP isn't as easy...
The possibilities for delivering targeted content using a combination of both are endless and offer much more control.
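DG's point about UAs being easy to spoof can be shown in a couple of lines. This is a hedged illustration using Python's standard library, with a placeholder URL: any client can claim to be any browser or bot by setting one header, which is why UA checks alone prove nothing. Spoofing the source IP of a real TCP connection is much harder, since the handshake requires replies to reach you.

```python
# Spoofing a user agent takes one header; the URL is a placeholder.
import urllib.request

req = urllib.request.Request(
    "http://www.example.com/",
    headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
)
# The request now presents itself as Googlebot to any UA-only check.
print(req.get_header("User-agent"))
```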
Would you guys agree that there's no need for IP Cloaking (or cloaking as Alan defined it) other than for SEO purposes?
|Would you guys agree that there's no need for IP Cloaking (or cloaking as Alan defined it) other than for SEO purposes? |
No. Bad bots that drain bandwidth are a problem. Using this technique can save money in hosting fees. If I use IP and UA redirects to show different pages to aggressive bots then I can save bandwidth.
korkus, can you explain what you mean a bit more? I don't understand the technique you're talking about.
Say a new search engine writes a bot that doesn't respect robots.txt, and it is as abusive on your server as the big guys. You see no reason to have it indexing your site, and it is tying up CPU time and bandwidth. How do you get rid of it? Test for IP and UA, and every hit sends it to a page with no links to follow, telling the bot it is not allowed. There are many bots out there, like harvesters, that don't respect robots.txt, and instead of having them leech bandwidth you test and trap.
Under the definition given, I have cloaked an SE bot. I targeted its UA and IP and sent it to a page that normal users will never see. I don't consider this an SEO issue. There is also no ethical issue here. Whether I want bots viewing my property is my decision alone.
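The test-and-trap idea korkus describes can be sketched as a simple gate in front of page delivery. The blocklists here are made-up examples; a real deployment would maintain them from server logs:

```python
# UA + IP "test and trap": known-abusive clients get a dead-end page
# with no links to follow, so there is nothing for them to leech.
from ipaddress import ip_address, ip_network

BAD_NETWORKS = [ip_network("192.0.2.0/24")]       # hypothetical abusive crawler range
BAD_AGENTS = ("emailsiphon", "webzip", "badbot")  # hypothetical harvester UAs

TRAP_PAGE = "<html><body>Automated access is not allowed here.</body></html>"

def serve(remote_ip: str, user_agent: str, real_page: str) -> str:
    """Return the trap page for listed IPs or UAs, the real page otherwise."""
    ua = user_agent.lower()
    addr = ip_address(remote_ip)
    if any(addr in net for net in BAD_NETWORKS) or any(a in ua for a in BAD_AGENTS):
        return TRAP_PAGE  # no links, no content worth crawling
    return real_page
```

Note this is exactly the mechanics of cloaking - IP and UA detection deciding what content to return - applied to bandwidth protection rather than rankings.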
For those who aren't aware, I've been involved in a discussion on the issue of cloaking at another forum site. I felt I had to jump in there after the owner of that site said, "If Danny Sullivan and others don't know what the difference is, then they should be alerted to [Alan's] article to learn what 'is' the difference."
He assumed that Alan's article was the ultimate definition of cloaking. I had a different view. I like Alan, and I agree with him on many issues. However, I think his article is causing more confusion than the clarity he intends. The two of us have been having quite a friendly but intense email debate on this over the past few days.
My definition of cloaking, revised a bit from that earlier post, goes something like this:
"Cloaking is getting a search engine to record content for a URL that is different than what a searcher will ultimately see, often intentionally."
Unlike Alan, I am specifically NOT concerned about the technicalities of how the cloaking is done. In fact, I think the big mistake these days is trying to define cloaking in technical terms. I don't care if you agent detect, IP detect, "poor man" cloak via a frameset or use an XML feed. If the searcher sees something different than the content of the page recorded by the index, that's cloaking.
In fact, Brett points out well many of the types of "cloaking" that some argue exist. Such arguments infuriate others like Alan, who see them as simply a way to justify what they consider "deceptive" behavior with search engines. Both have truth in what they say.
Crucially, I'm not assigning any wrongdoing to cloaking, as Alan does. The fact that content is cloaked does not to me automatically mean that you are spamming. I think that's an important distinction that has to be made, if you want to help people understand the issue, as Jill, Alan and Brett all want.
Instead, I'd rather see the discussion focused on "approved" and "unapproved" cloaking. For example, I consider XML feeds a form of approved cloaking. I don't think most people use them because they say, "Great, here's a way to cloak" but rather because they offer a good way to feed in a product database. Nevertheless, to me they are a form of cloaking -- but one that's approved by the search engines that allow them.
Given this, why say a text-only page would be bad? To me, if the content is substantially like the page the user sees, there shouldn't be a problem. I, for one, would not consider someone spamming for doing this.
Nevertheless, as Brett pointed out, he knows someone that got pinged for spamming, because of it. That underscores the fact that the cloaking, while not deceptive in spirit, remained unapproved and left the site vulnerable to getting a spam penalty. If the site is eventually reviewed by the unnamed search engine, I wouldn't be surprised if the penalty was removed. They'd essentially give approval for a form of cloaking.
How about the case with Google, which hates cloaking? Is it cloaking when it does language or country detection and if a search engine spider is influenced by this when recording Google's content? Technically, sure. Was it an intentional act? No, but that doesn't matter. Google could find itself banned by search engines against cloaking.
Who are these search engines? Interestingly, AltaVista and AllTheWeb.com have no terms on their sites against cloaking that I found, when reviewing things after Alan's article was posted last week. Google, Inktomi and Teoma do have terms against it, though none of them define exactly what type of cloaking is bad. In other words, they don't say, "IP-based cloaking is bad; language detection is OK." They simply say it's showing the search engine content different than what a user sees.
So for these three, cloaking of any type could get you in trouble. I argue that Inktomi and Teoma both have a form of "approved" cloaking through XML feeds that go against their stated rules. I further strongly suspect that both have ordinary HTML content that's allowed to be cloaked via paid inclusion. Given this, I'd certainly like to see them say that "unapproved" cloaking is not allowed, just to clarify things.
As for Google, things are pretty clear there. Cloak -- show them content different than what a user sees -- and you could get into trouble. I would say the more technically advanced and intentional the cloaking, the more likely you are to be targeted by Google.
So ultimately, to Jill's confused readers, I'd say this:
"Cloaking is getting a search engine to record content for a URL that is different than what a searcher will ultimately see, often intentionally. It can be done in many technical ways. Several search engines have explicit bans against unapproved cloaking, of which Google is the most notable one. Some people cloak without approval and never have problems. Some even may cloak accidentally. However, if you cloak intentionally without approval -- and if you deliver content to a search engine that is substantially different from what a search engine records -- then you stand a much larger chance of being penalized by search engines with penalties against unapproved cloaking. If in doubt, as a search engine if they have a problem with what you intend to do."
I'd like to say all the search engines will promptly respond if asked, but they probably won't. Still, if you've asked and ended up in trouble, then you can at least show you tried to get clarification. Moreover, I doubt most of Jill's readers will ever need to prove they asked to engage in cloaking. Few of them are big, industrial cloakers likely to get in trouble. In contrast, Redzone has come "out of the closet" as someone who intentionally cloaks, feels he has a good reason in some cases to do it, and seems aware of the risks involved.
I'll conclude that I don't expect everyone to agree that I've provided the ultimate definition of cloaking any more than they might for Alan. It is simply what I hope will help some people make correct decisions, whatever they ultimately decide. Also, Alan will probably jump back in to dissect my arguments. That's fine -- I'm not going to respond back simply because I've already done that once in another forum :)
|"Cloaking is getting a search engine to record content for a URL that is different than what a searcher will ultimately see, often intentionally." |
IMO this is a better definition and not so exclusionary of new forms of cloaking that we may have not even seen yet.