we have a travel site which has about a thousand pages, and has been online in the search engines for about 5 years. it uses a CMS, one that outputs in html.
we have never done any real seo, apart from changing a few tags around since last year. however it ranks ok for the keywords we have chosen.
we are going to do a total re-design. the new design proposed has a lot of rich media content, is still using the content management system, and will be very graphic oriented.
if we made a duplicate version in plain html for accessibility reasons, exactly like the bbc have done (you can see this under the 'text only' version on the bbc main page - bbc.co.uk/home/today/textonly.shtml), and then put robot exclusion tags on every page of the rich media site, could this work?
so in effect we have an index splash page with 2 entry points on it. main visitors will go to the colourful graphic pages. that link blocks the search engines, so they follow the other link to the text only pages. previous link popularity would still go to the main index page, plus new links deeplinking to the text only pages.
is this considered a kind of cloaking, or is it just an alternative way of doing things? we would have to have the accessibility part anyway for legal reasons as it is a brand site. this way we would just be taking advantage of seo at the same time.
we would also need to have the robot exclusion tags because of the duplicate content. this way there would only be 1 set of copy indexed and it would just be the easy to spider text stuff.
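to make the "robot exclusion tag" idea concrete, here is a rough sketch in python that checks whether a page carries a robots meta tag. the sample pages and the "noindex, follow" value are assumptions for illustration only (they are not taken from our site), and RobotsMetaParser / robots_directives are made-up helper names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # content looks like "noindex, follow"; split it into single directives
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical rich media page: asks not to be indexed, but lets links be followed.
RICH_PAGE = """<html><head>
<title>Rich media page</title>
<meta name="robots" content="noindex, follow">
</head><body>...</body></html>"""

# Hypothetical text only page: no robots meta tag, so it can be indexed normally.
TEXT_PAGE = """<html><head><title>Text only page</title></head>
<body>...</body></html>"""

if __name__ == "__main__":
    print("rich:", sorted(robots_directives(RICH_PAGE)))
    print("text:", sorted(robots_directives(TEXT_PAGE)))
```

something like this could be run over the CMS output to confirm that every rich media page has the tag and that no text only page picked it up by mistake.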
any help, info or ideas would be appreciated thanks.
Logically, this should work, as you are not attempting to manipulate the SEs with the graphical pages and, if they are blocked, they should not appear to be "duplicates" of the text pages.
However, many pages I have blocked with NOINDEX, NOFOLLOW turn up as URL only listings. In a couple of cases, where they were, in fact, near duplicates of an indexed page, the near duplicate indexed page has been dropped to URL only as well. Not the plan at all! Would like to know why.
Where I have put near duplicate pages in a separate directory (eg landing pages for AdWords), with the directory blocked entirely by robots.txt, it seems to have been 99% effective: no sign of these in Google (except where a stray link may have crept in) and no related duplicate penalties.
I'd definitely try to avoid having both pages indexed as this could cause problems with the search engines. I'd put, as already suggested, both versions in separate folders and disallow one of them via the robots.txt standard. This makes handling very easy as you don't have to adjust the robots.txt file every time you add/change/remove a file that's in the root directory.
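As a rough illustration of the folder approach, here is a small Python sketch that feeds a two-line robots.txt to the standard library's urllib.robotparser and checks which URLs a crawler would be allowed to fetch. The folder names /rich/ and /text/, the example.com host and the is_crawlable helper are assumptions made up for the example, not details of the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one rule blocks the whole rich media folder,
# so it never needs editing when pages inside that folder change.
ROBOTS_TXT = """\
User-agent: *
Disallow: /rich/
"""

SITE = "http://www.example.com"  # placeholder host


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


PARSER = build_parser(ROBOTS_TXT)


def is_crawlable(path, agent="*"):
    """True if the given user agent may fetch the path under these rules."""
    return PARSER.can_fetch(agent, SITE + path)


if __name__ == "__main__":
    for path in ("/", "/text/index.html", "/rich/index.html"):
        print(path, "->", "allowed" if is_crawlable(path) else "blocked")
```

One thing to keep in mind: a crawler that obeys the Disallow rule never downloads the blocked pages, so it will not see any robots meta tag placed on them.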