|how to AVOID pages being indexed|
[spun off from another thread]
this thread was copied from another in the Google search engine forum, but it's a topic that might stand on its own as a general webmaster issue. I hope I've been able to salvage the pertinent posts here. -RCJ
rencke and rc,
Do you have any thoughts about keeping spiders away from the content pages and only allowing them to feed on the NOFRAMES content of the frameset?
What tedster asked - I need that, too (I have pages that need to be hidden).
Is there any way to "hide" the contents inside the noframes tag? My competitors would surely "view source" and see the noframes keyword coding. I'm sure I've seen this done somewhere; I just can't remember how it was done.
>Do you have any thoughts about keeping spiders away from the content pages and only allowing them to feed on the NOFRAMES content of the frameset?
Well, the site I'm referencing had no concerns with hiding the content, and my philosophy regarding spidering is generally that more is better. Since the content was dynamically generated (using a query string "?"), I assumed it was not going to be spidered anyway.
No way that I know of. There was a thread yesterday that I believe discussed some tricks to make it less visible, but it was still there in the source.
>>Do you have any thoughts about keeping spiders away from the content pages and only allowing them to feed on the NOFRAMES content of the frameset?
If you have nothing to hide, why put all your eggs in the same basket? The more pages (with different titles and descriptions), the better the chance that one of them will rank high in a search result and bring people into the site.
If, on the other hand, you do have something to hide - well, sorry, I have no idea other than the meta robots tag - but I am told that not all engines respect it.
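For reference, the meta robots tag mentioned above goes in the head of each page you want kept out of the index - and, as noted, not every engine honors it:

```html
<!-- Place in the <head> of each page to keep out of indexes -->
<meta name="robots" content="noindex,nofollow">
```

"noindex" asks the engine not to list the page; "nofollow" asks it not to follow the links on it.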
This is good to know. The pages in question for me are some "helps" pages at a domain site (with more to go up after editing), which were also put up at a helps site done in conjunction with volunteer work - never intended to be spidered or listed, never submitted. I didn't want them spidered or indexed; they were for "community" use only, to help members and avoid a lot of duplicate posting of information and answers.
BUT - other members of the community linked to the pages on message boards, which I wasn't expecting and hadn't even thought of, and indeed the boards were spidered, and therefore so were those pages. I just checked Google, and every page on my "community site" is indexed. The issue now is having duplicate content up.
I'll just get them 404'd. How interesting - how to avoid listings instead of how to get them.
So, in this new thread on the subject, the question is:
|how to avoid listings instead of how to get them|
...and not just in noframes.
I've had fair success in stopping email harvesters by using js to write the mailto tags. It seems using document.write to code links to pages could provide a similar "stop" for spiders.
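A minimal sketch of that technique: build the link markup in script so the address (or URL) never appears whole in the page source. The function name and addresses here are illustrative, not from any library.

```javascript
// Assemble a mailto link at runtime so harvesters scanning the raw
// HTML never see a complete address. The same idea works for page
// links you'd rather spiders not follow.
function buildMailtoLink(user, domain, label) {
  // The "@" and the pieces are only joined when the script runs.
  return '<a href="mailto:' + user + '@' + domain + '">' + label + '</a>';
}

// In the page itself you would emit it with document.write, e.g.:
// document.write(buildMailtoLink('webmaster', 'example.com', 'Contact us'));
```

Note that this only stops harvesters and spiders that don't execute script; anyone viewing the page in a browser still gets a working link.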
Purposely loading the page with words spiders view negatively, such as "links, resource, directory", might also add some defense without getting you marked as a spammer.
As for boards linking to the help pages: how about a JS redirect (to a 404, perhaps) unless the referrer matches an approved page?
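A rough sketch of that referrer gate, assuming a list of approved referring URLs (the list and the error page here are placeholders, not real addresses):

```javascript
// Approved referrer prefixes - placeholder URLs for illustration.
var approvedReferrers = ['http://www.example.com/members/'];

// Returns true if the referrer starts with any approved prefix.
function isApproved(referrer, approved) {
  for (var i = 0; i < approved.length; i++) {
    if (referrer.indexOf(approved[i]) === 0) return true;
  }
  return false;
}

// In the help page itself:
// if (!isApproved(document.referrer, approvedReferrers)) {
//   window.location.replace('/not-found.html'); // hypothetical error page
// }
```

Keep in mind that visitors with JS off, and any spider that ignores script, will sail right past this - it's a deterrent, not a lock, so the meta robots tag is still worth having on those pages.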