|Detecting Googlebot and sending it noindex, nofollow|
| 10:27 pm on Feb 19, 2008 (gmt 0)|
I wish to ban Google from some pages on my site, but it's a content-driven ban, not a "contained within this folder / these URLs" type of ban.
E.g., if Googlebot requests a page, I use code to determine whether the page served is "rich enough" for Google to see.
If it's not, I serve a noindex, nofollow meta tag.
As the test for richness seems to be less strict for MSN and Yahoo, I would still apply this test, but apply it differently for them.
Is this OK? Am I cloaking?
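A minimal sketch of the idea in Python. The field names, scoring rule, and threshold below are illustrative assumptions, not anything Google publishes:

```python
# Sketch: decide whether a listing page should carry a noindex,nofollow
# meta tag, based on a simple "richness" heuristic. The optional fields
# and the threshold are made-up examples.

def richness_score(listing: dict) -> int:
    """Count how many optional detail fields the trainer filled out."""
    optional_fields = ["breeds", "techniques", "champions", "bio"]
    return sum(1 for field in optional_fields if listing.get(field))

def robots_meta(listing: dict, threshold: int = 2) -> str:
    """Return the robots meta tag to emit for this page, if any."""
    if richness_score(listing) < threshold:
        return '<meta name="robots" content="noindex,nofollow">'
    return ""  # rich enough: no restriction

thin = {"breeds": ""}
rich = {"breeds": "Arabians", "techniques": "liberty work", "champions": "3"}
print(robots_meta(thin))   # thin page gets the noindex tag
print(robots_meta(rich))   # rich page gets nothing
```

Note that this emits the same tag to every visitor, human or bot; it is only when the tag is served selectively by user agent that the cloaking question below arises.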
| 10:40 pm on Feb 19, 2008 (gmt 0)|
Yes, you are. If you are showing one page to humans and a different one to *only* a search engine bot, then that is the definition of cloaking.
If the content serving is based on:
- time of day
then that can pass as acceptable, because you are serving different content to a lot of different people.
| 11:17 pm on Feb 19, 2008 (gmt 0)|
OK... but you see what I am trying to do, yes?
I might have 100 listings in a directory site.
15 of them are unique to my site and are rich, Google-worthy content.
The other 85 are not; many other sites list the same details.
I of course want people to be able to navigate to all pages, and my site is less useful to visitors if I only display unique data... but if my site is to be penalised because of the 85 pages of thin content, I want search engines not to see those pages.
It's not an attempt to trick search engines... it's just a "hey, is that you, Google? This page is not up to your stringent standards, so please don't visit it or index it... the page is useful to my visitors, though, and MSN/Yahoo like it, so I want them to see it. OK?"
With a 100-page site I could do this with a large robots.txt... I just can't do that with robots.txt on large sites.
| 12:17 am on Feb 20, 2008 (gmt 0)|
>> i just can't do it using robots.txt on large sites
Can you separate the content? You could put those pages in a sub-directory and use a single line in the robots file to exclude it.
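For example, if the thin listings all lived under a hypothetical /listings/basic/ path, one rule aimed only at Google's crawler would cover them, while MSN and Yahoo could still crawl everything:

```
User-agent: Googlebot
Disallow: /listings/basic/
```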
| 12:46 am on Feb 20, 2008 (gmt 0)|
Google offers page exclusions (although they don't provide an example of same).
| 12:56 am on Feb 20, 2008 (gmt 0)|
|It's not an attempt to trick search engines... it's just a "hey, is that you, Google? This page is not up to your stringent standards, so please don't visit it or index it... the page is useful to my visitors, though, and MSN/Yahoo like it, so I want them to see it. OK?"|
So you'd be okay explaining to a Google engineer that part of your website is not up to Google's standards, so you decided to hide that part from Google? If what you're saying is true, then you don't really have an alternative, do you?
| 12:58 am on Feb 20, 2008 (gmt 0)|
Can't you also just do a
<meta name="googlebot" content="noindex,nofollow">
so that Google won't index your page but everyone else will?
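The name="googlebot" variant is honoured only by Google's crawler; other engines ignore it and fall back to any generic "robots" meta tag (or none at all). A sketch of the head section:

```
<head>
  <!-- Googlebot alone is told not to index or follow this page -->
  <meta name="googlebot" content="noindex,nofollow">
  <!-- No generic "robots" tag, so MSN/Yahoo index the page normally -->
</head>
```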
| 3:25 am on Feb 20, 2008 (gmt 0)|
Yes, that's my answer.
Google's standards for what should be in their index are not the same as what should be on a website.
My site is about horse trainers. I have details of 12,000 horse trainers. Anyone coming to my site should be able to search for and find any horse trainer on it.
But some of my horse trainer listings are no richer than a Yellow Pages entry, so it makes no sense for those pages to be in Google; they're duplicates.
The rich pages, where a trainer has filled out lots of specific details (breeds, training techniques, champions they've trained, etc.), should be in.
I am fine with allowing MSN/Yahoo to see all pages.
Google says don't show the bare listing data... or that's what they seem to be saying.
But of course these bare listings should stay on the site. Just because they have not been updated to be 'better than everyone else' on the net does not mean my site visitors don't want to see them.