Forum Moderators: goodroi

wildcard for googlebot

     
1:04 pm on Nov 4, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 11, 2005
posts:45
votes: 0


I would like Googlebot to stop indexing all URLs on the path below. The directory where the wildcard sits changes all the time. Is this OK, and will Googlebot follow the rule?

User-agent: Googlebot
Disallow: /index.php/cPath/*/sort/

Thanks. I hope I have explained this properly.

12:04 pm on Nov 5, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 22, 2002
posts:1001
votes: 0


Wildcards don't exist in the standard robots.txt.

Just use:

Disallow: /index.php/cPath/

That will block everything below cPath.
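A minimal sketch of how that prefix rule behaves, using Python's standard-library urllib.robotparser, which applies plain prefix matching without wildcards. The host www.example.com and the paths under cPath are made up for illustration; they are not from the original poster's site.

```python
from urllib.robotparser import RobotFileParser

# The prefix rule suggested above, scoped to Googlebot.
RULES = """\
User-agent: Googlebot
Disallow: /index.php/cPath/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Hypothetical URLs, for illustration only.
sorted_listing = "http://www.example.com/index.php/cPath/21/sort/2a"
plain_category = "http://www.example.com/index.php/cPath/21"
other_page = "http://www.example.com/index.php/products"

for url in (sorted_listing, plain_category, other_page):
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(verdict, url)
```

Note that the prefix rule is broader than the original request: it blocks every URL under cPath, not only the ones that contain /sort/.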

[robotstxt.org...]

8:30 am on Nov 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 5, 2002
posts:1562
votes: 0


As it's a Googlebot-specific question:
Google does allow wildcards, as you can see at [google.com...]
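
As a rough illustration of wildcard matching in the style Google documents (* matches any run of characters, a trailing $ anchors the end of the URL), here is a sketch that translates a rule into a regular expression. The function name google_style_match and the sample paths are hypothetical, and this is not Googlebot's actual implementation.

```python
import re


def google_style_match(pattern: str, path: str) -> bool:
    """Return True if a robots.txt pattern with * and $ matches the path."""
    # A trailing $ means the rule must match through the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each * match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so rules still behave as prefixes.
    return re.match(regex, path) is not None


rule = "/index.php/cPath/*/sort/"
print(google_style_match(rule, "/index.php/cPath/21_45/sort/2a"))
print(google_style_match(rule, "/index.php/cPath/21_45/"))
```

Under these assumptions the rule from the first post matches sorted listings in any category directory, while the unsorted category pages stay crawlable.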
4:02 pm on Nov 7, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 22, 2002
posts:1001
votes: 0


True, although I personally feel that once we start using individual rules for each search engine, we may as well throw the rule book in the bin. Also, don't forget that these aren't real pages we're talking about; they're dynamic.

In reality, as the above file is dynamically generated, IMO the best (easiest?) way to keep these pages out of Google's index would be to include <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW"> in the page head.
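
A minimal sketch of that idea: since the page is generated dynamically, the script can decide per request whether to emit the tag. The site in question appears to be PHP (index.php), so Python is used here purely for illustration; the function name robots_meta_for and the path test are assumptions, not the poster's code.

```python
GOOGLEBOT_NOINDEX = '<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">'


def robots_meta_for(path: str) -> str:
    """Return the meta tag for sorted cPath listings, else an empty string."""
    # Assumed URL shape: /index.php/cPath/<category>/sort/<order>
    after_cpath = path.partition("/cPath/")[2]
    if "/sort/" in after_cpath:
        return GOOGLEBOT_NOINDEX
    return ""


# Hypothetical request paths, for illustration only.
print(robots_meta_for("/index.php/cPath/21/sort/2a"))
print(robots_meta_for("/index.php/cPath/21"))
```

The returned string would be written into the page head by whatever template renders the listing.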