Robots.txt exclude file types



6:16 am on Jul 29, 2002 (gmt 0)


How do I exclude a particular file extension from being picked up by robots?
Would this do it?

User-agent: *
Disallow: .asp$


7:03 am on Jul 29, 2002 (gmt 0)

pageoneresults

I've never been able to find any confirmation of this other than what Google states, and that applies strictly to Googlebot.


> 3. I don't want Google to crawl part or all of my site.

There is a standard method involving a "robots.txt" file for excluding robot crawlers. This will prevent Googlebot or other crawlers from visiting your site. Googlebot has a user-agent of "Googlebot". In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate that the $ must match the end of a name. For example, to prevent Googlebot from crawling files that end in gif, you may use the following robots.txt entry:

User-Agent: Googlebot
Disallow: /*.gif$
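
To make the quoted rule concrete, here is a rough Python sketch of how a crawler might evaluate such a pattern. It is not Googlebot's actual code. It assumes that patterns are matched from the start of the URL path, that * matches any run of characters, and that a trailing $ pins the match to the end of the path. The function names pattern_to_regex and is_disallowed are made up for illustration.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Disallow pattern using the * and $ extensions into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" so that
    # every * in the pattern matches any sequence of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # \Z pins the match to the very end of the path when the pattern ended in $.
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(path: str, pattern: str) -> bool:
    # re.match only matches at the start of the string, which models the
    # assumption that Disallow values are compared from the start of the path.
    return pattern_to_regex(pattern).match(path) is not None

if __name__ == "__main__":
    for path in ("/images/logo.gif", "/images/logo.gif?size=2", "/page.asp", "/page.aspx"):
        print(path, is_disallowed(path, "/*.gif$"), is_disallowed(path, "/*.asp$"))
```

Under these assumptions, "/*.asp$" blocks /page.asp but not /page.aspx, while a pattern written as ".asp$" with no leading "/*" matches nothing, because every path begins with a slash.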

P.S. Hmmm, just realized, Google has incorrect syntax shown on that page. The "A" in Agent should not be capitalized. Shame on them! ;)

