Blocking all URLs to a website using the robots.txt file?

2:01 am on Jan 4, 2006 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2005
votes: 0

I run a real estate website, and we link out to an MLS system on almost every page. Is there a way to use the robots.txt file to stop spiders from following these links? Each link is a different URL, but they all point to the same domain.



10:05 am on Jan 6, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 31, 2005
votes: 0

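No, not via robots.txt. A robots.txt file only controls crawling of URLs on the site that hosts it, so it cannot stop a bot from following links out to another domain.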

However, you can do it in either of two ways:
1) The robots meta tag
<meta name="robots" content="index,nofollow">
This tells a bot to index the page but not to follow any links it finds on that page.
Advantage: You only have to do it once per page, in the <head> section.
Disadvantage: It applies to every link on the page. It is not selective to outbound links, so internal links are affected too.
See [robotstxt.org...] for more details.
2) The rel="nofollow" link attribute
e.g. <a href="http://www.example.com" rel="nofollow">
See [linktutorial.com...]
Advantage: It lets you selectively tell a bot not to follow certain links.
Disadvantage: You have to do it for each link.

Which one you should use depends on what your pages look like and what you are trying to achieve.
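
If you go with option 2 and the pages are static HTML files, adding the attribute by hand to every link gets tedious. Below is a rough sketch of how it could be scripted. The function name add_nofollow and the domain mls.example.net are made up for illustration, and the regular expressions are a simplification rather than a full HTML parser, so check the output before publishing it.

```python
import re
from urllib.parse import urlparse

# Rough patterns for illustration only; they do not handle every HTML edge case.
A_TAG = re.compile(r'<a\b[^>]*>', re.IGNORECASE)           # an opening <a ...> tag
HREF = re.compile(r'href\s*=\s*["\']([^"\']*)["\']', re.IGNORECASE)
REL = re.compile(r'\srel\s*=', re.IGNORECASE)              # an existing rel attribute


def add_nofollow(html, domain):
    """Add rel="nofollow" to links whose host is `domain` or a subdomain of it."""

    def fix(match):
        tag = match.group(0)
        href = HREF.search(tag)
        if not href:
            return tag  # no href, nothing to do
        host = urlparse(href.group(1)).hostname or ""
        if host != domain and not host.endswith("." + domain):
            return tag  # internal or unrelated link, leave it alone
        if REL.search(tag):
            return tag  # already has a rel attribute, do not touch it
        # Insert the attribute just before the closing ">" of the tag.
        return tag[:-1] + ' rel="nofollow">'

    return A_TAG.sub(fix, html)


# Hypothetical page fragment: one outbound MLS link and one internal link.
page = ('<a href="http://mls.example.net/listing/1">Listing</a> '
        '<a href="/about">About</a>')
print(add_nofollow(page, "mls.example.net"))
```

Only the link to the MLS domain gets the attribute; the internal link is left unchanged, which is the selectivity that option 1 cannot give you.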