Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

wildcard for googlebot

 1:04 pm on Nov 4, 2005 (gmt 0)

I would like Googlebot to stop indexing all URLs on the path below. The wildcard is there because that directory changes all the time. Is this OK, and will Googlebot follow the rule?

User-agent: Googlebot
Disallow: /index.php/cPath/*/sort/

Thanks, I hope I have explained this properly.
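For what it's worth, a quick way to sanity-check how that `*` pattern would match is to translate it into a regex. A minimal sketch, assuming `*` matches any sequence of characters and the rule is a prefix match (as Google describes its wildcard handling):

```python
import re

def matches(pattern: str, path: str) -> bool:
    """Check whether a robots.txt Disallow pattern containing '*'
    wildcards matches a URL path, treating the rule as a prefix match."""
    # Escape regex metacharacters, then turn the escaped '*' back into '.*'
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.match(regex, path) is not None

# The rule from the question:
print(matches("/index.php/cPath/*/sort/", "/index.php/cPath/22/sort/2a"))  # True
print(matches("/index.php/cPath/*/sort/", "/index.php/cPath/22/page/2"))   # False
```

So any cPath value followed by `/sort/` would be caught, while other pages under cPath would not.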



 12:04 pm on Nov 5, 2005 (gmt 0)

Wildcards aren't part of the original robots.txt standard. Use a plain prefix rule instead:


Disallow: /index.php/cPath/

That will block everything below /index.php/cPath/.
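You can check what a standards-compliant (no-wildcard) parser makes of that prefix rule with Python's stdlib `urllib.robotparser`; a small sketch, with a hypothetical host name:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /index.php/cPath/",
])

# Everything below /index.php/cPath/ is blocked...
print(rp.can_fetch("Googlebot", "http://example.com/index.php/cPath/22/sort/2a"))  # False
# ...but the rest of the site is still crawlable.
print(rp.can_fetch("Googlebot", "http://example.com/index.php"))  # True
```

The trade-off, of course, is that this blocks all category listings under cPath, not just the sorted variants the wildcard was meant to target.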



 8:30 am on Nov 7, 2005 (gmt 0)

As it's a Googlebot-specific question:
Google does support wildcards in robots.txt, as you can see at [google.com...]


 4:02 pm on Nov 7, 2005 (gmt 0)

True, although I personally feel that once we start writing individual rules for each search engine, we may as well throw the rule book in the bin. Also, don't forget that these aren't real pages we're talking about; they're generated dynamically.

In reality, as the pages above are dynamically generated, IMO the best (easiest?) way to block spidering would be to include <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW"> in the page head.
