I use both the META tags and a robots.txt file - the robots.txt specifically excludes the shopping cart page, but all of my META tags (except on the cart page) have "INDEX,FOLLOW" because much of the content on my site is cross-linked and I want to be sure the spiders find everything.
I've never read conclusively which is supposed to take precedence - the robots.txt file or the meta tags - so I'm not convinced that I'm doing it correctly.
In particular, I notice that the pesky Road Runner / ImageScape robot seems to crawl every link it comes across. Maybe I should just ban it completely?
Any other suggestions?
Robots.txt takes precedence. If a page is disallowed in robots.txt, a compliant spider won't fetch it. If the spider doesn't fetch the page, it can't read the on-page meta robots tag, so whatever that tag says never comes into play.
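To make the order of checks concrete, here is a rough sketch of how a well-behaved spider might decide, using Python's standard urllib.robotparser. The robots.txt contents, the page names (/cart.html, /products.html) and the "ExampleBot" user agent are made up for illustration; they are not taken from anyone's actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes only the shopping cart page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart.html
"""


def may_fetch(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots.txt allows this user agent to fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


def crawl_decision(robots_txt: str, user_agent: str, path: str) -> str:
    """Describe what a compliant spider does with a given path."""
    # robots.txt is consulted first, before any request for the page itself.
    if not may_fetch(robots_txt, user_agent, path):
        return "skipped: disallowed by robots.txt, meta robots tag never read"
    # Only a fetched page can have its meta robots tag parsed.
    return "fetched: meta robots tag is read and applied"


if __name__ == "__main__":
    for page in ("/cart.html", "/products.html"):
        print(page, "->", crawl_decision(ROBOTS_TXT, "ExampleBot", page))
```

The point is just the ordering: the meta tag check sits behind the robots.txt check, so a disallowed page never gets as far as having its tag read.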
I banned ImageScape several months ago. It is one of many robots that don't fetch robots.txt, fetch it but don't obey it, or ignore both robots.txt and the on-page meta robots tags. Robots.txt is a request from you to the robots; the good ones will comply, and the bad ones need to be blocked or banned by stronger measures.
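As one illustration of a "stronger measure", here is a minimal sketch of refusing requests by User-Agent at the server, written as Python WSGI middleware. This is not necessarily how I did my ban. The "imagescape" substring is only a guess at what the robot sends; check your own logs for the real User-Agent string before relying on it.

```python
# Hypothetical list of lowercase substrings to refuse; "imagescape" is an
# assumed token, not a confirmed User-Agent value.
BLOCKED_AGENT_SUBSTRINGS = ("imagescape",)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent contains any blocked substring."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_SUBSTRINGS)


def block_bad_robots(app):
    """Wrap a WSGI app so that blocked user agents get a 403 response."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return middleware
```

Unlike robots.txt, this does not depend on the robot's cooperation, although a robot that fakes its User-Agent will still get through and would have to be blocked by IP address or similar.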
If you're interested in shrinking your robots.txt file, you might find this post [webmasterworld.com] interesting.
HTH,
Jim