Forum Moderators: Robert Charlton & goodroi
Has anyone tried using the rel="nofollow" attribute on internal links, rather than off-site links, to keep spiders out of parts of their own site they don't want indexed?
My site has multiple versions of some pages. The version that should be indexed links only to pages with extra content. The version that should not be indexed links to the content pages as well as to pages where content can be added. This not-to-be-indexed version uses a "?locfilter=off" query string on the same base URL. Since the base URL stays the same, robots.txt can't be used here.
I've already put robots noindex headers in the pages, but of course the spider won't know until it visits the page. This also tends to leave empty URLs in the index. What I would like to do is tell the spiders not to follow these links in the first place, which is what I hope the rel="nofollow" attribute can accomplish.
Does anybody have some insight into this?
Thanks.
Still, it would be nice to keep the spider from going down that path in the first place.
Use JavaScript, and keep the query-string URL out of the href so it isn't crawlable. You can do it the following way:
If the original link was:
<a href="page.php?arg">
change it to:
<a href="page.php" onclick="return section('arg')">
and define section as:
<script type="text/javascript">
function section(arg) {
  location.href = 'page.php?' + arg;  // go to the query-string version
  return false;                       // cancel the default jump to the bare href
}
</script>
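If the argument is passed as a separate name and value, the same idea can be sketched as below. This is only an illustration under stated assumptions: buildSectionUrl and sectionLink are hypothetical names that are not part of the original suggestion, and the optional loc parameter exists only so the code can be tried outside a browser.

```javascript
// Hypothetical helper, not from the original post: builds the URL that the
// click handler navigates to, so the query string never appears in an href.
function buildSectionUrl(base, name, value) {
  // Encode both parts so values containing spaces or '&' stay valid.
  return base + '?' + encodeURIComponent(name) + '=' + encodeURIComponent(value);
}

// Click handler in the same style as section(). `loc` is an assumption made
// for this sketch; in a page it is omitted and the global location is used.
function sectionLink(name, value, loc) {
  var target = loc || location;
  target.href = buildSectionUrl('page.php', name, value);
  return false; // cancel the default navigation to the bare href
}
```

The link would then read <a href="page.php" onclick="return sectionLink('locfilter', 'off')">. Visitors with JavaScript disabled simply land on page.php without the query string.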