I have some Flash websites and, because they are Flash movies, spiders can't read their content well or follow their links.
I used .htaccess to redirect Googlebot (and also text browsers such as Links
and Lynx) to an alternative home page (index_text.php) that contained
the same text and links as the Flash animation, but was written in plain
HTML and was easily indexed by search engines.
This method worked on all my Flash sites (hosted with different
providers) until February-March 2006, when Googlebot stopped indexing this
text page and started indexing the regular home page as a normal user would.
Any ideas about what happened and how to solve the problem?
This is my .htaccess:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Links [OR]
RewriteCond %{HTTP_USER_AGENT} ^Lynx
RewriteRule ^$ index_text.php
Thanks in advance.
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^Googlebot [OR]
I wouldn't touch cloaking with a 10-foot pole, but common sense tells me nobody would juggle lists of tens of thousands of spider IPs and keep them up to the second if you could fool 4,000 PhDs simply by using "... HTTP_USER_AGENT} ^Googlebot".
I'd say you might need some better cloaking software that keeps up with the SE IPs automatically. PM me if you want my recommendations.
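Googlebot's user-agent string now begins with "Mozilla/5.0 (compatible; Googlebot/" rather than "Googlebot". Therefore, your start-anchored regular-expression pattern no longer matches its requests.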
You could remove the start anchor and use:
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
or keep the anchor and match the new prefix:
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5\.0\ \(compatible;\ Googlebot/ [OR]
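Put together with the original rules, the unanchored version might look like the sketch below. It assumes the rest of the original .htaccess stays as posted; the [L] flag is my addition and is optional here.

```apache
RewriteEngine On
RewriteBase /

# Unanchored match: catches both a user-agent that starts with "Googlebot"
# and one that starts with "Mozilla/5.0 (compatible; Googlebot/"
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Links [OR]
RewriteCond %{HTTP_USER_AGENT} ^Lynx
# Only requests for the site root are rewritten to the plain-HTML page
RewriteRule ^$ index_text.php [L]
```

The unanchored pattern is the looser of the two options: it will match any user-agent containing "Googlebot" anywhere in the string, which is usually what you want here but is worth knowing.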
You might also want to make sure you send a 'Vary' header to warn network caches that you are serving user-agent-dependent content:
# Tell caches that page content changes depending on client user-agent
<FilesMatch "\.(html|php)$">
Header set Vary: "User-Agent"
</FilesMatch>
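A slightly more defensive variant of that block, offered as a sketch rather than a required change: it assumes mod_headers may not be loaded on every host, and it uses "append" instead of "set" so that any Vary value already present on the response is kept rather than overwritten.

```apache
# Only apply the directive if mod_headers is available on this host
<IfModule mod_headers.c>
    <FilesMatch "\.(html|php)$">
        # Append to an existing Vary header instead of replacing it
        Header append Vary User-Agent
    </FilesMatch>
</IfModule>
```

Note that the alternation inside the FilesMatch pattern must be a plain pipe character.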
Jim