As I see it, you are presenting optimized content to the search engines and a different page to the customer?
So far, with the information I have gathered, I have been trying to do it by creating a robots.txt file that allows specific spiders access to the optimized pages and disallows access to every other area of the site.
I believe the optimized pages need the noindex and noarchive meta tags on them? Also, if you are disallowing access to, say, your index page in favor of your optimized page, how do the spiders see the relationship between the optimized page and your index page?
Could anybody please fill in the grey areas, as I seem to be reaching a lot of dead ends?
Cheers
Mrgreedy
Cloaking really has nothing to do with robots.txt. Cloaking, for SEO purposes, means serving different content at the same URL depending on who the visitor is.
This requires some kind of scripting language to identify visitors and decide which content they get, as well as to selectively serve the content.
Most cloaking is done using either Perl or PHP. A cloaked web page is a Perl or PHP script that identifies each visitor as either a search engine spider or an ordinary visitor (a human). If the visitor is a search engine spider, the script serves the optimized web page. Otherwise, the script serves some other page, such as the home page.
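To make the decision step concrete, here is a minimal sketch of the "identify the visitor, then pick the content" logic. It is written in Python for readability rather than Perl or PHP, and it assumes the script only looks at the User-Agent header. The token list and the two file names are hypothetical placeholders, not taken from any real cloaking package.

```python
# Hypothetical substrings to look for; a real spider list is longer
# and changes over time.
SPIDER_TOKENS = ("googlebot", "bingbot", "slurp")


def is_spider(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a known spider token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in SPIDER_TOKENS)


def choose_page(user_agent: str) -> str:
    """Pick which (hypothetical) file to serve to this visitor."""
    return "optimized.html" if is_spider(user_agent) else "home.html"
```

A script built this way would call choose_page() with the request's User-Agent header and send back the matching file. Since the header is supplied by the visitor, a check like this is easy to fool by anyone who sets their user agent to a spider's.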
You can get some more details on the process here: [webmasterworld.com...]
A new question like this should normally be posted in a thread of its own, so this thread can stay on-topic, but I'll answer your question quickly.
If the cloaking is done right, there is no fool-proof way of detecting it. There will be nothing in the HTML source that gives it away. You could try a few things to detect it, though:
1) Check the Google cache. If the cached copy differs from the live page and the page ranks at the top of the SERPs, there is a good chance it is cloaked. If there is no cache, then the webmaster has used the NOARCHIVE meta tag to disallow caching. Some people think this is an indication of cloaking, but the tag is used for a number of unrelated purposes, so it isn't a sure-fire method of detection.
2) You can use a browser like Firefox to change your user agent to Googlebot's and access the possibly cloaked page. If the page you get differs from what you see with your normal user agent, then it is probably cloaked.
3) Try accessing the page through a search engine proxy, such as a translation tool or WAP proxy. If the page changes, then it may be cloaked.
All of the above methods are usually accounted for in commercial cloaking software.
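For anyone who wants to script checks 1) and 2), here is a rough sketch in Python using only the standard library. It fetches the same URL with two user agents and compares the results, and it also looks for a NOARCHIVE robots meta tag. The user agent strings, the 0.9 threshold, and all function names are assumptions made for illustration. A low similarity score is only a hint, because pages with rotating ads or timestamps differ between any two fetches.

```python
import difflib
import urllib.request
from html.parser import HTMLParser

# Example user agent strings (assumptions; substitute current ones).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"


def fetch(url, user_agent, timeout=10.0):
    """Download a page while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def similarity(html_a, html_b):
    """Return a 0.0-1.0 similarity ratio between two HTML strings."""
    return difflib.SequenceMatcher(None, html_a, html_b).ratio()


class _RobotsMetaParser(HTMLParser):
    """Records whether a robots/googlebot meta tag contains noarchive."""

    def __init__(self):
        super().__init__()
        self.noarchive = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noarchive" in content:
            self.noarchive = True


def has_noarchive(html):
    """Return True if the page asks search engines not to cache it."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.noarchive


def maybe_cloaked(url, threshold=0.9):
    """Hint only: True if the spider and browser views differ a lot."""
    as_spider = fetch(url, GOOGLEBOT_UA)
    as_browser = fetch(url, BROWSER_UA)
    return similarity(as_browser, as_spider) < threshold
```

Calling maybe_cloaked() with the address of the suspect page runs the comparison. A script that identifies spiders by something other than the user agent will serve both fetches the same page, so a "not cloaked" result proves nothing.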