

Sitemaps, Meta Data, and robots.txt Forum

Robots.txt exclude file types

 6:16 am on Jul 29, 2002 (gmt 0)

How do I exclude a particular file extension from being picked up by robots?
Would this do it?

User-agent: *
Disallow: .asp$



 7:03 am on Jul 29, 2002 (gmt 0)

I've never been able to find any confirming information on this, other than what Google states, which applies strictly to Googlebot.


> 3. I don't want Google to crawl part or all of my site.

There is a standard method involving a "robots.txt" file for excluding robot crawlers. This will prevent Googlebot or other crawlers from visiting your site. Googlebot has a user-agent of "Googlebot". In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate that the $ must match the end of a name. For example, to prevent Googlebot from crawling files that end in gif, you may use the following robots.txt entry:

User-Agent: Googlebot
Disallow: /*.gif$

P.S. Hmmm, just realized, Google has incorrect syntax shown on that page. The "A" in Agent should not be capitalized. Shame on them! ;)
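
To see how the * and $ extensions quoted above behave, here is a minimal Python sketch. It is only an illustration of the wildcard rules as Google describes them, not Googlebot's actual matcher, and the function names (pattern_to_regex, is_disallowed) are made up for this example. It assumes patterns are matched from the start of the URL path.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Disallow pattern with * and $ into a regex (illustrative only)."""
    # A trailing $ means the pattern must match the end of the URL path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then let each * match any sequence of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str, pattern: str) -> bool:
    # Assumption: a pattern matches from the start of the path,
    # and is a prefix match unless it ends in $.
    return pattern_to_regex(pattern).match(path) is not None

if __name__ == "__main__":
    for path in ["/images/logo.gif", "/images/logo.gif?size=2", "/page.asp", "/page.aspx"]:
        print(path, is_disallowed(path, "/*.gif$"), is_disallowed(path, "/*.asp$"))
```

Under these assumptions, /*.asp$ blocks /page.asp but leaves /page.aspx alone, while dropping the $ (just /*.asp) would block both. Other crawlers may not understand these extensions at all, so this only says something about Googlebot-style matching.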


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved