If a search engine's spider is excluded from the "offending" pages, is that as good as not having the pages at all?
What is the robots.txt exclusion syntax for Yahoo's Slurp?
Also, what if I am already using "User-agent: *" to keep all search engines out of some pages for security reasons?
Can I use multiple instances of "User-agent:" in robots.txt to cover specific search engines and sub-directories on my site?
Example:

User-agent: *
Disallow: /cgi-bin/

User-agent: Slurp
Disallow: /folder/

Thanks
Fred
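Yes, you can use multiple "User-agent:" records in your robots.txt. Your example excludes robots from your cgi-bin and tells Slurp not to spider "/folder/". One caveat: a robot that finds a record naming it follows only that record and ignores the "User-agent: *" record, so if Slurp should also stay out of the cgi-bin, repeat "Disallow: /cgi-bin/" under "User-agent: Slurp".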
There are some tutorials and a robots.txt validator at Search Engine World [searchengineworld.com].
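As a rough way to check how a file like the one in the question is read, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and the "Googlebot" agent name are placeholders chosen for illustration, and a real crawler may interpret the file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question, with a blank line between the two records.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

User-agent: Slurp
Disallow: /folder/
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical URLs on a placeholder host.
checks = [
    ("Slurp", "http://example.com/folder/page.html"),
    ("Slurp", "http://example.com/cgi-bin/script.cgi"),
    ("Googlebot", "http://example.com/folder/page.html"),
    ("Googlebot", "http://example.com/cgi-bin/script.cgi"),
]
for agent, url in checks:
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent:10} {url} -> {verdict}")
```

With this parser, Slurp is blocked from /folder/ but not from /cgi-bin/, because it matches its own record first and never falls back to the "*" record.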
Span,
Good. However, will doing so protect me from a search engine's penalties for having those particular pages on my site, since its robot has been excluded from them? Some of my sub-directory pages might be a problem for Yahoo, but not for Google or MSN.
Also, if I want to exclude other pages with slightly different names (e.g. index1.html, index2.html, index3.html, etc.), could I use the following to cover them all?
User-agent: Slurp
Disallow: /index?.html
Thanks
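On the "index?.html" question: under the original robots exclusion standard, a Disallow value is a plain path prefix and "?" is not a wildcard, so that line would be read literally. A minimal sketch with Python's standard urllib.robotparser, which follows the prefix-matching rule, compares it with a bare prefix. The example.com host and the blocked_paths helper are made up for illustration, and individual crawlers may support their own pattern extensions that this parser does not model.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(disallow_value, paths, agent="Slurp"):
    """Return the paths that a single Disallow line blocks for the agent."""
    parser = RobotFileParser()
    parser.parse([f"User-agent: {agent}", f"Disallow: {disallow_value}"])
    # example.com is a placeholder host used only for illustration.
    return [p for p in paths
            if not parser.can_fetch(agent, "http://example.com" + p)]


PATHS = ["/index.html", "/index1.html", "/index2.html",
         "/index3.html", "/about.html"]

# "?" is matched literally, so none of the numbered pages are covered.
print(blocked_paths("/index?.html", PATHS))

# A bare prefix covers every path starting with "/index",
# which also includes /index.html itself.
print(blocked_paths("/index", PATHS))
```

If /index.html itself must stay crawlable, the safe alternative under the original standard is one Disallow line per file.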