jatar_k

msg:1529422 | 4:22 pm on Aug 7, 2005 (gmt 0) |
It isn't a supported tag, but that doesn't mean they can't use it.
|
Dijkgraaf

msg:1529423 | 10:24 pm on Aug 7, 2005 (gmt 0) |
Well, obviously they can and have used it. However, they risk other bots/spiders getting confused and disregarding all their rules.
|
effisk

msg:1529424 | 1:35 pm on Aug 17, 2005 (gmt 0) |
A bit off-topic, but I see /search listed in Google's robots.txt, and if you look at these results: [search.msn.com...] you'll see a Google search page listed there, which means search.msn has indexed a disallowed URL. It's funny to see a Google result page listed in a Microsoft one :) The msnbot has probably picked up this link from several other websites...
|
Lord Majestic

msg:1529425 | 1:54 pm on Aug 17, 2005 (gmt 0) |
> you'll see a Google search page listed there.
It's a link that must have been found on one of the crawled pages elsewhere. robots.txt only regulates which pages should NOT be retrieved, not which URLs may be used to link to the site.
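
To illustrate the point, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `examplebot` user agent, and the example.com URLs are made up for illustration and are not copied from any real site. The only question the parser answers is whether a given bot may retrieve a URL; nothing in robots.txt says whether that URL may appear as a link on someone else's page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not a copy of any real site's robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs on a placeholder host.
blocked_url = "http://www.example.com/search?q=widgets"
allowed_url = "http://www.example.com/about.html"

# can_fetch() only answers: may this user agent retrieve the URL?
# It says nothing about whether the URL can be discovered via links elsewhere.
print(parser.can_fetch("examplebot", blocked_url))
print(parser.can_fetch("examplebot", allowed_url))
```

A well-behaved crawler would skip fetching the first URL, but it can still learn that the URL exists from links on other sites it has crawled.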
|
PatrickDeese

msg:1529426 | 1:59 pm on Aug 17, 2005 (gmt 0) |
> which means search.msn has indexed a disallowed URL.
I think your conclusion is wrong: MSN has not indexed the page, only the URL. Google will also show URL-only results, even for pages that are blocked via robots.txt.
|