Why would you use the ROBOT nofollow command?

jlyons1234

1:34 pm on Apr 4, 2004 (gmt 0)

Why would you use the ROBOT nofollow command? Doesn't it just let Google see your index page and none of your links? Isn't the point to try to add up as many keywords as possible?

Jay

atlrus

4:33 pm on Apr 4, 2004 (gmt 0)

Do you mean the meta tag "robots" or the "robots.txt"?

Miop

5:43 pm on Apr 4, 2004 (gmt 0)

I am just thinking of using it on my web site to block the robots from the customer sign up/log in pages and the shopping basket page, since Google seems to be indexing anything and everything (dynamic site).

Powdork

6:28 pm on Apr 4, 2004 (gmt 0)

It would be better to put noindex on the destination page. This will tell Google not to index that page. If you put nofollow on the page linking to it, Google may still read the link and index the URL of the page. Additionally, you would be blocking Google from following any of the other links on the page.
<meta name="robots" content="noindex, nofollow"> on the shopping basket and/or sign in pages will do the trick. Or you can use your robots.txt file, but that will only keep the bot from visiting the page; it won't keep it from indexing the URL.
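
To double-check that the tag actually made it onto the basket or sign-in page, something like the following could be used. This is only a minimal sketch using Python's standard html.parser; the class name RobotsMetaParser, the helper robots_directives and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical shopping basket page carrying the tag suggested above.
basket_page = (
    '<html><head><meta name="robots" content="noindex, nofollow"></head>'
    "<body>Basket</body></html>"
)
print(sorted(robots_directives(basket_page)))  # ['nofollow', 'noindex']
```

An empty result means the page carries no robots meta tag at all.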

GuinnessGuy

6:53 pm on Apr 4, 2004 (gmt 0)

Hi Powdork,

Will what you suggested prevent PR leakage as well?

GuinnessGuy

Powdork

7:32 pm on Apr 4, 2004 (gmt 0)

I don't believe in PR leakage

atlrus

12:20 am on Apr 5, 2004 (gmt 0)

Use robots.txt to stop the bots visiting - if they don't visit the page - they can't index it.

g1smd

12:40 am on Apr 5, 2004 (gmt 0)

>> Use robots.txt to stop the bots visiting - if they don't visit the page - they can't index it. <<

Not true.

If Google merely sees the page mentioned in a link from some other page, then it will still list it in the SERPs but as URL only, without a title or description.

The only safe way is to use the noindex on the page itself.
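
A rough illustration of the point: a robots.txt rule only answers the question "may this bot fetch this URL?". It says nothing about whether the URL can be listed. The sketch below uses Python's standard urllib.robotparser; the rules and the www.example.com URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a shop that blocks its basket and login pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /basket/
Disallow: /login/
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


rules = build_rules(ROBOTS_TXT)

# A well-behaved crawler will not request the blocked page...
print(rules.can_fetch("Googlebot", "http://www.example.com/basket/view"))  # False
# ...but it is still free to fetch everything else.
print(rules.can_fetch("Googlebot", "http://www.example.com/products/"))  # True
```

Because the blocked page is never fetched, the crawler also never sees a noindex tag on it, which is why the URL-only listing described above can still appear.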

GoogleGuy

4:37 am on Apr 5, 2004 (gmt 0)

GuinnessGuy, I love the nickname. :) I would use the nofollow meta tag when you really don't want spiders to follow your links. For example, if you have a web application where pages are generated on the fly and they take a while to compute, you might not want spiders following lots of your links, just people. So robots.txt works well when you know your site architecture well in advance, but if you're generating pages on the fly, it's always safe to put nofollow (and possibly noindex) in the meta tags.
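
For pages generated on the fly, the tag can simply be emitted by whatever builds the page. A minimal sketch, assuming a plain Python string template; robots_meta_tag and render_generated_page are hypothetical helper names, not part of any framework.

```python
from html import escape


def robots_meta_tag(index=True, follow=True):
    """Build a robots meta tag from two boolean switches."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


def render_generated_page(title, body_html):
    """Wrap expensive, dynamically generated content in a page that asks
    spiders neither to index it nor to follow its links."""
    head = "<title>{}</title>{}".format(
        escape(title), robots_meta_tag(index=False, follow=False)
    )
    return "<html><head>{}</head><body>{}</body></html>".format(head, body_html)


print(robots_meta_tag(index=False, follow=False))
# <meta name="robots" content="noindex, nofollow">
```

Passing follow=False alone gives the nofollow-only variant, for pages that may be indexed but whose links should not be crawled.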

g1smd

7:15 pm on Apr 5, 2004 (gmt 0)

What if you put nofollow on your own pages that point to pages you do not want indexed, but then someone on some other site links to one of those pages?

Then it will be indexed, no?

That is why I suggest a noindex on the page that you do not want indexed. That will stop it being indexed irrespective of who or what links to it.

Or did I miss something?

GoogleGuy

7:42 pm on Apr 5, 2004 (gmt 0)

g1smd, I agree. If you don't want a page indexed or you want to be safe, just use noindex on the page that shouldn't show up in search results. Relying on nofollow wouldn't work if you forgot to add nofollow to a few pages, or if someone else pointed to that page.

If you know you don't want a section of your site indexed, I'd use both noindex, plus nofollow to make sure that outgoing links weren't crawled either.
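
The reasoning can be played out on a toy link graph. This is purely an illustrative model and makes no claim about how Google's crawler really works: the assumptions are that a spider skips the links on a page flagged nofollow and declines to list a page flagged noindex. The shop.example and other.example URLs are hypothetical.

```python
def discovered_urls(pages, start_urls):
    """Return every URL a spider learns about, skipping links on nofollow pages."""
    seen = set(start_urls)
    queue = list(start_urls)
    while queue:
        url = queue.pop()
        page = pages.get(url)
        if page is None or page["nofollow"]:
            continue  # unknown page, or page asks that its links not be followed
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


def listed_urls(pages, start_urls):
    """Discovered URLs minus those that carry noindex on the page itself."""
    return {
        url
        for url in discovered_urls(pages, start_urls)
        if not pages.get(url, {}).get("noindex", False)
    }


BASKET = "http://shop.example/basket"
PAGES = {
    # Own home page: nofollow set, so its link to the basket is not followed.
    "http://shop.example/": {"links": [BASKET], "nofollow": True, "noindex": False},
    # Someone else's page links to the basket without nofollow.
    "http://other.example/": {"links": [BASKET], "nofollow": False, "noindex": False},
    # The basket page protects itself with noindex.
    BASKET: {"links": [], "nofollow": True, "noindex": True},
}

own_only = discovered_urls(PAGES, ["http://shop.example/"])
with_external = discovered_urls(PAGES, ["http://shop.example/", "http://other.example/"])

print(BASKET in own_only)       # False: nofollow on the home page hides it
print(BASKET in with_external)  # True: the outside link exposes it anyway
print(BASKET in listed_urls(PAGES, ["http://shop.example/", "http://other.example/"]))  # False
```

In this model only the noindex on the basket page itself survives the outside link.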

Ledfish

1:48 pm on Apr 6, 2004 (gmt 0)

Adding to what GG said, if you can isolate it to a whole directory, then using both the robots meta tag and the robots.txt file is a good idea.

Robots.txt is best used for entire directories rather than specific files, and for excluding specific bots entirely.
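
Both uses can be sketched in a single file: one block that shuts out a particular bot completely, and one that keeps everyone out of a directory. The check below uses Python's standard urllib.robotparser; the bot name BadBot, the /private/ directory and the www.example.com URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one bot outright, block one directory for all others.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

SITE = "http://www.example.com"

# The named bot is excluded from the whole site.
print(rules.can_fetch("BadBot", SITE + "/index.html"))            # False
# Other bots are only kept out of the one directory.
print(rules.can_fetch("Googlebot", SITE + "/index.html"))         # True
print(rules.can_fetch("Googlebot", SITE + "/private/page.html"))  # False
```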

DoppyNL

1:56 pm on Apr 6, 2004 (gmt 0)

If you're concerned that your server gets too many hits, then using the meta tags won't do the complete job, as the pages will still be requested occasionally.
It's then better to use a robots.txt, since (good) crawlers then only request the robots.txt.

Although I know GoogleGuy once commented that GoogleBot will back off (i.e. make fewer requests or come back later) when the response time of your server is slow.
Which would result in Google crawling your site in the off-hours :) nice feature!

@GoogleGuy:
Is there a size-limit for a robots.txt?
When does a crawler (or the googlebot) choke on a robots.txt because of its size?

Brett_Tabke

4:11 pm on Apr 7, 2004 (gmt 0)

I have tested "no follow" extensively - Google does not functionally obey it. They will find the links from elsewhere and follow those. Its usage is moot.