|Robots text file or noindex,nofollow|
| 1:42 pm on Jul 29, 2004 (gmt 0)|
Just curious which one is more effective and faster: a robots.txt file that disallows search engines from crawling my site, or a noindex,nofollow meta tag?
Also, if a robots.txt file is better and I put it in my server root directory, does it still work if I have an index.html file in that directory?
And finally, I have some links on Yahoo that I want removed because they are duplicates or near-duplicates, listed on both yahoo.com and yahoo.ca. It's a real pain in the butt: when I update my web sites, or when Yahoo updates its links, all it does is switch one of the duplicate (or near-duplicate) links from yahoo.com to yahoo.ca or vice versa, and then it won't update with the new info! It just keeps switching links between yahoo.com and yahoo.ca instead of picking up the new info (title, description, etc.).
| 3:13 pm on Jul 29, 2004 (gmt 0)|
the first thing to remember is that robots.txt only works for those 'bots that request it... then it only works for those that respect it...
as for which one is fastest, i'm not sure speed really comes into play... robots.txt should take effect immediately but some bots don't use it during the crawl... they pull it to put in the master database and then the database assigns the bots to pull only what is allowed...
[aside: grub is one such bot and it should be noted that grub is still in development and undergoing many changes and adjustments... one way of looking at grub is that the database is the bot and the spiders visiting sites are only retrievers that gather what the database tells them to gather... actually, that's exactly how to view grub... it is a lot different from other bots and search engines... but back to your questions...]
on the question of most effective, that would likely be the metatag in each html document but again, that depends on if the bot recognises it... some do, some don't... i don't have a list of either, though... i use a combination of both on my site(s)...
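To make the metatag side concrete, here is a minimal sketch of how a bot might read the robots meta tag out of a page. It is only an illustration using Python's standard html.parser module, not how any particular search engine does it. The names RobotsMetaParser and robots_directives are made up for this example, and trimming whitespace around each directive is an assumption about how a tolerant parser would behave.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags.

    Hypothetical helper for illustration; real crawlers may differ.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # html.parser already lowercases tag and attribute names.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # Assumption: split on commas and ignore surrounding whitespace
        # and letter case, as a tolerant parser plausibly would.
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html_text):
    """Return the set of robots meta directives found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


page = (
    '<html><head><meta name="robots" content="noindex,nofollow">'
    "</head><body></body></html>"
)
print(sorted(robots_directives(page)))
```

A bot that does not look for this tag at all will simply ignore it, which is why combining it with robots.txt is a reasonable belt-and-braces approach.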
robots.txt is required to be in the root directory if it exists... it is not looked for in any other directory... robots.txt is requested by name directly no matter what else is in the root directory...
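As a rough sketch of the above, this is how a well-behaved bot could work out where to ask for robots.txt and then check a URL against it, using Python's standard urllib modules. The host www.example.com, the bot name ExampleBot, and the helper robots_url are placeholders invented for the example. The rules shown block everything for all bots, which is one way to write "don't crawl my site".

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the robots.txt location for the site that serves page_url.

    The leading slash makes urljoin resolve against the site root,
    regardless of which directory the page itself lives in.
    """
    return urljoin(page_url, "/robots.txt")


# Example rules: disallow every path for every user agent.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
# parse() takes the file's lines directly, so no network access is needed.
parser.parse(rules.splitlines())

print(robots_url("http://www.example.com/some/dir/page.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
```

Having an index.html in the same directory makes no difference here, because the bot asks for /robots.txt by that exact name.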
i don't know how to help you with the yahoo problem, sorry...
| 2:01 pm on Jul 30, 2004 (gmt 0)|
Ok, thanks for the help. Just one more question though: does it matter if you leave a space after the comma between noindex and nofollow? Should it be "noindex, nofollow" or "noindex,nofollow"? I have seen it both ways. Does it matter which way you type it?
| 4:18 am on Jul 31, 2004 (gmt 0)|
Anyone know if it matters whether there is a space between the comma and the word follow ("index, follow"), or should it be "index,follow" with no space? Does it matter?