Forum Moderators: goodroi
Can I just use:
Disallow: buy.php
Or do I have to create different robots.txt files for the files in different subdirectories?
look at msg #10 in this thread for a classic explanation of robots.txt posted by jdMorgan
there is only one robots.txt located in the root directory.
try
Disallow: /*buy.php
each construct needs to start with the root /
but the wildcard * will include /alldirectories/ between the root "/" and buy.php
but note the footnote on jdMorgan's post: you should not use the wildcard in a user-agent: * construct, because it is not widely supported. So you should do
user-agent: googlebot
disallow: /*buy.php

user-agent: otherbotswhosupportthis
disallow: /*buy.php

user-agent: *
disallow:
or else you need to
user-agent: *
disallow: /buy.php
disallow: /directory/buy.php
disallow: /dir/buy.php
and if buy.php is the only .php file then there is another way.
disallow: /*.php
# this will disallow all .php files in any directory
same rule applies for user-agent: * though
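To see what a wildcard rule like /*buy.php would catch, here is a rough Python sketch of the matching. It assumes the crawler treats * as "any run of characters" and matches the rule against the start of the URL path, which is how the wildcard is described above; the function names are made up for illustration and are not any crawler's real code.

```python
import re

def robots_pattern_to_regex(pattern: str) -> "re.Pattern":
    """Turn a Disallow pattern containing '*' into a compiled regex (illustrative only)."""
    # Escape every literal piece, then join the pieces with '.*' so that
    # each '*' in the rule stands for any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Rules are matched from the start of the path (prefix match).
    return re.compile("^" + body)

def is_blocked(path: str, pattern: str) -> bool:
    """True if the path would be disallowed by the pattern under the assumptions above."""
    return robots_pattern_to_regex(pattern).match(path) is not None

for path in ["/buy.php", "/dir/buy.php", "/dir/sub/buy.php?id=3", "/index.html"]:
    print(path, is_blocked(path, "/*buy.php"))
```

Under these assumptions the same rule would also catch a file such as /notbuy.php, since * can stand for any characters, so check your own file names before relying on it.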
"Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif"."
So you will have to have
disallow: /abc/buy.php
disallow: /buy.php
disallow: /def/buy.php
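If there are a lot of directories, typing one line per directory gets tedious. A small Python sketch like this could build the explicit list; the directory names are just the ones used as examples in this thread, and build_robots_txt is a hypothetical helper, not part of any library.

```python
import posixpath

def build_robots_txt(directories, filename="buy.php", user_agent="*"):
    """Build one wildcard-free Disallow line per directory (illustrative helper)."""
    lines = [f"User-agent: {user_agent}"]
    for directory in directories:
        # An empty string stands for the site root; every path must start with "/".
        path = posixpath.join("/", directory.strip("/"), filename)
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"

# Example directories taken from the post above: root, /abc/ and /def/.
print(build_robots_txt(["", "abc", "def"]))
```

The output goes into the single robots.txt in the root directory.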
[google.com...]
How do I block all crawlers except Googlebot from my site?
The following robots.txt file will achieve this
User-agent: Googlebot
Disallow: /*?
Why would other bots use the directives of user-agent: googlebot?
This robots.txt would just cause all other bots to assume that all is allowed. Same as:
User-agent: Googlebot
Disallow: /*?
User-agent: *
disallow:
You just have to remember that wildcards in disallow lines are supported by only a few crawlers, including googlebot.
So if you want to stop google from listing your dynamic pages but Yahoo and MSN are OK already then
user-agent: googlebot
disallow: /*?
would have the desired effect.
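To make the point about which bot reads which section concrete, here is a simplified Python sketch of how a crawler might pick its group. It assumes a bot uses the first group whose user-agent token appears in its own name and otherwise falls back to user-agent: *; real crawlers differ in the details, and select_group is a made-up name for illustration.

```python
def select_group(robots_txt: str, bot_name: str):
    """Return the Disallow values from the group that applies to bot_name (simplified)."""
    groups = {}            # lowercase user-agent token -> list of Disallow values
    current_agents = []    # user-agent lines that head the group being read
    seen_rule = False      # True once the current group has at least one rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:                      # a new group starts here
                current_agents = []
                seen_rule = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            seen_rule = True
            for agent in current_agents:
                groups[agent].append(value)
    name = bot_name.lower()
    for agent, rules in groups.items():
        if agent != "*" and agent in name:
            return rules
    # No specific group matched: use the catch-all group, or no rules at all.
    return groups.get("*", [])

google_only = "user-agent: googlebot\ndisallow: /*?\n"
print(select_group(google_only, "Googlebot"))  # rules from the googlebot group
print(select_group(google_only, "Slurp"))      # no group applies, so nothing is disallowed
```

With only the googlebot section present, other bots find no rules that apply to them, which has the same effect as adding an empty disallow: under user-agent: *.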