As Air mentioned [webmasterworld.com], the current state of cloaking is quite complex. I think it is worth a recap to see where cloaking stands today. Although this is a new article, it is somewhat of an update to an older thread we started last spring.
Cloaking usage is exploding on the web right now. There are many different types of cloaking and a diverse set of purposes for using it. The vast majority of the top 1000 sites (98%?) are cloaking in one form or another. You'd be hard pressed to find a top site that isn't cloaking today (including this one).
Forms of Cloaking
There are four or five related forms of cloaking. All rely on serving content and formats customized for each user. They differ in the criteria they use to decide what each user gets.
As you know, any site can identify several things about a user when they visit. These range from the ISP (IP address), the agent (browser), and the referring URL to the more esoteric HTTP headers, such as the Accept headers, which can reveal even more about the agent in use.
Agent: sites detect which browser is being used and deliver pages customized for that browser. Agent cloaking is used by most major sites, including all the top search engines.
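To make agent detection concrete, here is a minimal sketch in Python. The substrings, page names, and the page_for_agent function are all made up for illustration and are not taken from any real site. It matches on pieces of the User-Agent header and checks the most specific pattern first, since browsers of this era often include "Mozilla" and even "MSIE" in their strings.

```python
# Hypothetical substring-to-page table; order matters because many
# browsers identify themselves as "Mozilla" and some also claim "MSIE".
AGENT_VARIANTS = [
    ("opera", "opera.html"),
    ("msie", "ie.html"),
    ("mozilla", "netscape.html"),
]
DEFAULT_PAGE = "generic.html"


def page_for_agent(user_agent: str) -> str:
    """Return the page variant to serve for a User-Agent header value."""
    ua = user_agent.lower()
    for needle, page in AGENT_VARIANTS:
        if needle in ua:
            return page
    # Unknown or missing agent: fall back to a page that works everywhere.
    return DEFAULT_PAGE
```

Because the header is sent by the client, anyone can fake it, so a lookup like this is only suitable for presentation choices.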
IP Address: sites detect which IP address (and therefore which ISP) a user is coming from and custom feed a page. IP delivery can be done for several reasons:
- language targeting. ex: Users from Spain get a page in Spanish.
- broadband targeting. ex: @home users get a page optimized for high bandwidth cable connections.
- geo advertisement targeting. ex: Giving users in England an ad for the local London pub.
- content targeting. ex: A news site leading with its California news for California users, while Texas-based users get news for Texans.
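The IP delivery cases above all come down to the same lookup: map the visitor's address to a network, and the network to a page variant. A rough sketch using Python's standard ipaddress module follows. The network ranges are reserved documentation blocks standing in for real ISP allocations, and the variant names and variant_for_ip function are hypothetical.

```python
import ipaddress

# Hypothetical table of (network, variant). The ranges are placeholder
# documentation networks, not real ISP or country allocations.
NETWORK_VARIANTS = [
    (ipaddress.ip_network("192.0.2.0/24"), "spanish"),
    (ipaddress.ip_network("198.51.100.0/24"), "broadband"),
    (ipaddress.ip_network("203.0.113.0/24"), "london-ads"),
]


def variant_for_ip(remote_addr: str, default: str = "default") -> str:
    """Pick a page variant from the visitor's IP address."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed address: serve the default page rather than fail.
        return default
    for network, variant in NETWORK_VARIANTS:
        if addr in network:
            return variant
    return default
```

In practice the table would come from a maintained IP-to-ISP or IP-to-country database, and its accuracy decides how well the targeting works.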
Referral Targeting: sites deliver custom content based on the referring string. ex: Overriding a linking site's frames (for instance, targeting Ask Jeeves referrals for frame overriding). ex: Giving specific content, such as giving the CNN referral a page with links to more news stories, while giving the Yahoo referral a page with links to general topics.
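Referral targeting can be sketched the same way. The example below assumes the referring URL arrives in the standard Referer header. The domain table, page names, and page_for_referrer function are invented for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical mapping of referring domains to landing pages.
REFERRER_PAGES = {
    "cnn.com": "more-news.html",
    "yahoo.com": "general-topics.html",
}


def page_for_referrer(referer: str, default: str = "index.html") -> str:
    """Choose a landing page from the Referer header value."""
    try:
        host = (urlsplit(referer).hostname or "").lower()
    except ValueError:
        # Unparseable referrer: treat it as if none was sent.
        return default
    for domain, page in REFERRER_PAGES.items():
        # Match the domain itself or any subdomain such as www.cnn.com.
        if host == domain or host.endswith("." + domain):
            return page
    return default
```

Many visitors arrive with no referrer at all, so the default page has to stand on its own.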
Content Protection: a site generates a page for one user and not for another. ex: Giving Altavista one version of a page while giving users another.
The last form of cloaking is what I call Wild Card Cloaking. It covers a broad range of cloaking styles and purposes that are combinations of the above. Examples:
- Altavista's Trusted Feed program and Inktomi's Index Connect are examples of search engine cloaking programs.
- Custom content news feeds, where sites deliver news spiders headlines instead of raw content.
- RSS headline feeds are often cloaked for specific sites and agents.
There is an entire subculture of diverse wild card cloaking that is exploding in usage right now.
In this forum, we tend to focus only on cloaking for the sake of search engine optimization. As you can see from the above, the scope of cloaking is exploding right now in its many forms. SEO cloaking is just one small part of a much larger picture.
>>Content Protection: a site generates a page for one user and not for another. ex: Giving Altavista one version of a page while giving users another.
I think a variation of content protection we will begin to see much more of is Evil Bot Control.
The number of worthless bots crawling the web chewing up bandwidth has grown dramatically. It has now become much easier to track and monitor the bots you do want crawling your site than it is to track and ban the ones you don't.
IP delivery allows you to easily let the good bots in while at the same time preventing the bad bots from crawling, by serving them a completely blank page with no links to crawl.
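As a rough illustration of that idea, the sketch below keeps an allowlist of networks for the bots you want and hands everything else that does not look like a browser a blank page. The network range is a placeholder, the names are hypothetical, and the browser check is a crude assumption of mine, not something described above. A real setup would need a better way to tell people from unknown bots.

```python
import ipaddress

# Placeholder range standing in for the addresses of crawlers you trust.
GOOD_BOT_NETWORKS = [ipaddress.ip_network("192.0.2.0/28")]

# Crude assumption: ordinary browsers mention one of these in User-Agent.
BROWSER_HINTS = ("mozilla", "opera")

# A page with no content and, importantly, no links to follow.
BLANK_PAGE = "<html><head><title></title></head><body></body></html>"


def is_good_bot(remote_addr: str) -> bool:
    """True if the address falls inside a trusted crawler network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False
    return any(addr in net for net in GOOD_BOT_NETWORKS)


def body_for(remote_addr: str, user_agent: str, full_page: str) -> str:
    """Serve the real page to trusted bots and browsers, blank otherwise."""
    if is_good_bot(remote_addr):
        return full_page
    if any(hint in user_agent.lower() for hint in BROWSER_HINTS):
        return full_page
    return BLANK_PAGE
```

A bad bot that fakes a browser agent string still gets through this check, which is why the trusted side is keyed on IP address rather than on what the visitor claims to be.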
It also allows you to force the good bots to honor the intent of the Robots Exclusion Protocol.
For example, Google has the annoying habit of making links to excluded pages available to users. All it takes is for a page that GoogleBot has indexed to link to the pages you don't want indexed. The only way to prevent this from happening is to serve Google pages that contain links only to the pages you want indexed.
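One way to build such a link-filtered page is to run each outgoing link through your own robots.txt rules before writing it into the version served to the crawler. The sketch below uses Python's standard urllib.robotparser for the matching. The sample rules, the link list, and the crawlable_links function are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def crawlable_links(links, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Keep only the (href, text) pairs the given agent may fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [(href, text) for href, text in links
            if parser.can_fetch(agent, href)]
```

The page generated for the crawler would then be built from the filtered list only, so an excluded URL never appears in anything the crawler sees.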
I just wanted your input on an issue (as I understand it, of course) with IP-based cloaking systems. We already know how unreliable a user-agent-based cloaking system is, and for most of us that's probably why we use an IP-based one. So my question is: what happens now that IP spoofing is, or will become, more and more available (since the advent of Win2K, and even more so with WinXP)?
I mean, what is one supposed to do against such tactics to protect his hard work from snoopers?
Anyway, I hope I have it all wrong, or else we'll all need to put on our thinking caps and get to work!
Your help is as always really appreciated on this one.