Disallow keyword Ignored?

Who guarantees me that the Disallow keyword isn't ignored?

         

bkausbk

5:49 pm on Apr 20, 2005 (gmt 0)

10+ Year Member



Who guarantees me that search engines don't ignore the "Disallow" keyword and instead deliberately crawl the disallowed paths to grab as much information as possible?

bkausbk

Lord Majestic

6:10 pm on Apr 20, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Who guarantees me

Nobody.

bkausbk

9:52 pm on Apr 20, 2005 (gmt 0)

10+ Year Member



So this means one should use the "Disallow" keyword as little as possible and avoid listing files that may contain further references to disallowed files, don't you think?
I have an idea. I'll set up a domain and a website with one HTML file (not index.html, but something spiders wouldn't normally look for) which shouldn't be followed.
I'll add one entry to robots.txt that explicitly disallows this file. Then I'll create an index.html and submit its link to various search engines.
After 1-2 months I'll search for this site in various search engines to find out which search engines can be trusted. Yes, nice idea ;)
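
A rough sketch of how such a test could also be checked from the server side, assuming you can read the access log and that it uses the common "combined" log format. The trap file name, the sample robots.txt and the helper `find_trap_hits` are made up for illustration and are not from this thread. Any client that requests the disallowed file shows up in the result, whether or not a search engine later lists the page.

```python
import re

# Hypothetical trap file; any name that nothing else links to would do.
TRAP_PATH = "/trap-page.html"

# The single robots.txt entry described in the post above.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"

# Assumes the "combined" access log format (IP, date, request, status,
# size, referer, user agent). Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def find_trap_hits(log_lines, trap_path=TRAP_PATH):
    """Return (ip, user_agent) for every request to the disallowed file."""
    hits = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        # Ignore any query string when comparing paths.
        path = match.group("path").split("?", 1)[0]
        if path == trap_path:
            hits.append((match.group("ip"), match.group("agent")))
    return hits
```

Passing it the lines of a log file, for example `find_trap_hits(open("access.log"))`, gives the list of visitors that fetched the trap file. Keep in mind that a user agent string can be faked, so a hit only suggests which crawler it was.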

bkausbk

Lord Majestic

10:09 pm on Apr 20, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No - it means that Disallow is pretty much the best you can get, but nobody gives any guarantees: robots.txt is a voluntary convention that ethical search engines follow.
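
To illustrate why it is voluntary: the check happens entirely inside the crawler. Below is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the bot name and the URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Example rules only; a real crawler would download the site's robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Ask the parsed robots.txt whether this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved bot runs this check before each request and skips the
# URL when the answer is False.
print(allowed("ExampleBot", "http://example.com/index.html"))
print(allowed("ExampleBot", "http://example.com/private/page.html"))
```

Nothing on the server enforces the answer. A crawler that never performs this check can request `/private/page.html` like any other URL.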

Reid

9:02 am on Apr 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Disallow is only a tool for controlling goodbots. You are not blocking them; you are just telling them what not to crawl.
For badbots you need to use other tools, like .htaccess if you are on Apache.