WebmasterWorld

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt exclude file types
Lisa




msg:1527257
 6:16 am on Jul 29, 2002 (gmt 0)

How do I exclude a particular file extension from being picked up by robots?
Would this do it?

User-agent: *
Disallow: .asp$

 

pageoneresults




msg:1527258
 7:03 am on Jul 29, 2002 (gmt 0)

I've never been able to find any confirming information on this other than what Google states, and that applies strictly to Googlebot.

[google.com...]

> 3. I don't want Google to crawl part or all of my site.

There is a standard method involving a "robots.txt" file for excluding robot crawlers. This will prevent Googlebot or other crawlers from visiting your site. Googlebot has a user-agent of "Googlebot". In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate that the $ must match the end of a name. For example, to prevent Googlebot from crawling files that end in gif, you may use the following robots.txt entry:

User-Agent: Googlebot
Disallow: /*.gif$

P.S. Hmmm, just realized, Google capitalizes the "A" in Agent on that page, while the usual spelling is "User-agent". Field names are case-insensitive, so their example still works, but shame on them anyway! ;)
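
To make the Googlebot extensions quoted above concrete, here is a rough Python sketch of how such a pattern could be matched against a URL path. It is only an illustration of the two rules Google describes (* matches any sequence of characters, a trailing $ anchors the end of the name). The function names pattern_to_regex and is_disallowed are made up for this example, and the assumption that a pattern is matched from the start of the path comes from the usual robots.txt prefix convention, not from Google's text.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Google-style Disallow pattern into a compiled regex.

    Hypothetical helper: '*' becomes '.*', a trailing '$' anchors the end,
    and every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str) -> bool:
    """Return True if the path is blocked by the pattern (sketch only)."""
    if not pattern:
        # An empty Disallow value conventionally blocks nothing.
        return False
    # re.match anchors at the start of the path (assumed prefix matching).
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    for path in ("/images/logo.gif", "/images/logo.gif?size=2", "/page.asp"):
        print(path, is_disallowed(path, "/*.gif$"), is_disallowed(path, "/*.asp$"))
```

Under these assumptions, the pattern from the original question (.asp$) would match nothing, because no path starts with ".asp". Following Google's gif example, the Googlebot-only form would be Disallow: /*.asp$ under User-agent: Googlebot. Other robots may not understand either extension.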
