Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Googlebot Wildcard

 8:26 pm on Nov 17, 2003 (gmt 0)


Will a robots.txt that looks like this:

User-agent: Googlebot
Disallow: *.js
Disallow: *.css

prevent Google from spidering all *.js and *.css files on my server, including the ones in subdirectories? Is there any difference from:

User-agent: Googlebot
Disallow: /*.js
Disallow: /*.css

Or will this prevent Google from spidering only the *.js and *.css files in my root directory?



 4:18 pm on Nov 20, 2003 (gmt 0)

The only place wildcards are supported in the robots exclusion protocol [robotstxt.org] is in the User-agent variable.

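"Disallow: *.js" is read literally under the original standard, so it would only stop a file actually named "*.js" from being spidered.

"Disallow: /*.js" is read literally too, so it would only stop paths that begin with "/*.js", such as a directory with that name, from being spidered.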
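
To make the literal reading concrete, here is a minimal Python sketch of prefix matching as the original standard describes it. The function name `blocked_by_original_standard` is made up for illustration, and this is a simplified model rather than any crawler's actual code.

```python
def blocked_by_original_standard(path, disallow_rules):
    """Return True if `path` starts with any Disallow value.

    Simplified model of the original robots exclusion protocol:
    each Disallow value is a plain prefix, and "*" has no special meaning.
    """
    # An empty Disallow value blocks nothing, so empty rules are skipped.
    return any(path.startswith(rule) for rule in disallow_rules if rule)


rules = ["*.js", "/*.js"]

# A real script in a subdirectory is not matched by either rule.
print(blocked_by_original_standard("/scripts/app.js", rules))

# Only a path that literally begins with "/*.js" is matched.
print(blocked_by_original_standard("/*.js/readme.txt", rules))
```

Under this reading neither rule keeps a robot away from ordinary .js files, which is why a wildcard-aware extension is needed for that.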

<added>I nearly forgot! Welcome to WebmasterWorld, payday! :)</added>


 8:05 pm on Nov 21, 2003 (gmt 0)

Thanks for the reply, but I think Googlebot supports wildcards. What is the right syntax, for Googlebot only, to disallow indexing of all *.js files on my server, including those in all subdirectories? Is this possible?


 11:43 pm on Nov 21, 2003 (gmt 0)

GoogleGuy posted this:
Google says this:

Maybe someone else has more detailed info?


 11:26 am on Nov 22, 2003 (gmt 0)

From another item [google.com] in the Google FAQ:

Googlebot also understands some extensions to the robots.txt standard. Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate the end of a name. For example, to prevent Googlebot from crawling files that end in .gif, you may use the following robots.txt entry:

User-Agent: Googlebot
Disallow: /*.gif$

In a previous post someone stated that Googlebot is the only robot to accept these extensions, so using them will not keep other bots out of these pages.
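
Going only by the FAQ wording quoted above ("*" matches any sequence of characters, a trailing "$" marks the end of the name), the matching can be approximated with a small Python sketch. The function names are hypothetical and this is not Google's implementation. The patterns "/*.js$" and "/*.css$" are the FAQ's .gif example adapted to the original question.

```python
import re


def googlebot_pattern_to_regex(pattern):
    """Translate a Disallow pattern into a compiled regex (illustrative only)."""
    # A trailing "$" anchors the pattern at the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Each "*" matches any sequence of characters; the rest is literal text.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def blocked_for_googlebot(path, disallow_patterns):
    """Return True if any pattern matches `path` from its first character."""
    return any(
        googlebot_pattern_to_regex(p).match(path)
        for p in disallow_patterns
        if p  # an empty Disallow value blocks nothing
    )


patterns = ["/*.js$", "/*.css$"]
for path in ["/app.js", "/scripts/lib/app.js", "/css/site.css", "/index.html"]:
    print(path, blocked_for_googlebot(path, patterns))
```

In this model the leading "/" does not limit the rule to the root directory, because "*" can also match the slashes of subdirectories. Dropping the "$" turns the rule into a prefix match, so "/*.js" would also catch a path such as "/app.json".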


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved