Googlebot Wildcard payday
Will a robots.txt that looks like this:
Disallow: *.js
Disallow: *.css
prevent Google from spidering all *.js and *.css files on my server, including the ones in subdirectories? Is there any difference from:
Disallow: /*.js
Disallow: /*.css
Or will this prevent Google from spidering only the *.js and *.css files in my root directory?
The only place wildcards are supported in the robots exclusion protocol [robotstxt.org] is in the User-agent field. In a Disallow line the * is treated as a literal character:
"Disallow: *.js" will prevent the file named "*.js" from being spidered.
"Disallow: /*.js" will prevent the directory named "*.js" from being spidered.
<added>I nearly forgot! Welcome to WebmasterWorld, payday! :)</added>
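The literal reading described in the reply above can be sketched in a few lines of Python. This is only an illustration, assuming a crawler that follows the original protocol and compares the Disallow value to the start of the URL path character by character; the function name `is_disallowed_literal` is hypothetical and not part of any real crawler.

```python
def is_disallowed_literal(rule: str, path: str) -> bool:
    """Original-protocol style check: the rule is a plain path prefix.

    The '*' character gets no special treatment here, so it only
    matches a literal asterisk in the URL path.
    """
    if rule == "":
        # An empty Disallow value blocks nothing.
        return False
    return path.startswith(rule)


if __name__ == "__main__":
    # Blocked only if the path really starts with the characters "/*.js".
    print(is_disallowed_literal("/*.js", "/*.js"))
    # An ordinary script in a subdirectory is not affected by that rule.
    print(is_disallowed_literal("/*.js", "/scripts/app.js"))
    # A normal directory prefix works as expected.
    print(is_disallowed_literal("/scripts/", "/scripts/app.js"))
```

Under this reading, neither of the two variants in the question would keep a bot away from ordinary .js or .css files.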
Thanks for the reply, but I think Googlebot supports wildcards. What is the right syntax, for Googlebot only, to disallow indexing of all *.js files on my server, including those in all subdirectories? Is this possible?
DaveAtIFG
GoogleGuy posted this: [ ...] webmasterworld.com
Google says this: [ ...] google.com
Maybe someone else has more detailed info?
From another item [ google.com] in the Google FAQ:
Googlebot also understands some extensions to the robots.txt standard. Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate the end of a name. For example, to prevent Googlebot from crawling files that end in .gif, you may use the following robots.txt entry:
[pre]
User-Agent: Googlebot
Disallow: /*.gif$
[/pre]
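To see what the two extensions in the FAQ quote mean in practice, here is a rough Python sketch. It assumes only what the quote states: * matches any sequence of characters, a trailing $ marks the end of the name, and a pattern without $ is treated as a prefix. The function name `googlebot_style_match` is made up for this example, and the code is not Google's actual implementation.

```python
import re


def googlebot_style_match(pattern: str, path: str) -> bool:
    """Check a URL path against a Disallow pattern with '*' and '$' extensions."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]  # drop the end-of-name marker
    # Escape the literal pieces and join them with ".*" for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"  # the path must end where the pattern ends
    # re.match anchors at the start of the path, so an unanchored
    # pattern behaves like a prefix.
    return re.match(regex, path, flags=re.DOTALL) is not None


if __name__ == "__main__":
    for rule, url_path in [
        ("/*.gif$", "/images/photo.gif"),
        ("/*.gif$", "/images/photo.gif?size=large"),
        ("/*.js$", "/scripts/lib/app.js"),
        ("/*.js", "/data/app.json"),
    ]:
        print(rule, url_path, googlebot_style_match(rule, url_path))
```

Note the last case: without the trailing $, a pattern like /*.js also covers paths such as /data/app.json, so for the question in this thread the $ form (/*.js$ and /*.css$ under a User-Agent: Googlebot line) is probably the closer fit.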
In a previous post someone stated that Googlebot is the only robot to accept these extensions, so using them will not keep other bots out of these pages.