Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

How to disallow feeds in robots.txt, readable by all bots

 5:34 am on Aug 16, 2011 (gmt 0)

How do I disallow feeds in robots.txt? I tried "Disallow: /feed/" but it is not working. How do I disallow URLs ending with feed?
Is it "Disallow: /feed$" or "Disallow: /feed/$"?



 5:56 am on Aug 16, 2011 (gmt 0)

Either is correct, but bad bots will ignore it... and will also use those "hints" as to what to rip.

feed all by itself will work for bots that honor...


 6:08 am on Aug 16, 2011 (gmt 0)

:: cough, cough ::

Both $ forms are incorrect in robots.txt [robotstxt.org], because it doesn't "do" Regular Expressions.

Note also that globbing and regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot".

So if

Disallow: /feed/

isn't working, you need to bring out the heavy artillery, starting with .htaccess.
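
Before reaching for .htaccess, it can help to check which paths a plain "Disallow: /feed/" actually covers. The sketch below uses Python's standard urllib.robotparser, which does simple prefix matching with no wildcard support, as one example of a parser that follows the original protocol. The host example.com, the helper name allowed and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, matching the line discussed above.
RULES = """\
User-agent: *
Disallow: /feed/
"""

def allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path matters here.
    return parser.can_fetch(agent, "http://example.com" + path)

if __name__ == "__main__":
    for path in ("/feed/", "/feed/rss.xml", "/feed", "/blog/feed/"):
        print(path, "allowed" if allowed(RULES, path) else "blocked")
```

Under this prefix-only reading, /feed/ and everything below it is blocked, while /feed (no trailing slash) and /blog/feed/ are not, which may be why the rule appears not to work for feeds that live deeper in the site.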


 7:18 am on Aug 16, 2011 (gmt 0)

Correct, the regex $ is not required. Done. Otherwise, it is correct. The best method is to disallow ALL BOTS and then list which bots ARE ALLOWED (called whitelisting), but that puts me in the minority...
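
A rough sketch of what such a whitelist could look like, checked with Python's standard urllib.robotparser. The choice of Googlebot and bingbot as the allowed bots is only an example, and SomeOtherBot, example.com and the helper name may_fetch are hypothetical. As noted earlier in the thread, this only affects bots that honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Whitelist-style robots.txt: block everyone, then name the bots that may crawl.
# An empty "Disallow:" means "nothing is disallowed" for that group.
WHITELIST = """\
User-agent: *
Disallow: /

User-agent: Googlebot
User-agent: bingbot
Disallow:
"""

def may_fetch(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under WHITELIST."""
    parser = RobotFileParser()
    parser.parse(WHITELIST.splitlines())
    # example.com is a placeholder host.
    return parser.can_fetch(agent, "http://example.com" + path)

if __name__ == "__main__":
    for agent in ("Googlebot", "bingbot", "SomeOtherBot"):
        print(agent, "allowed" if may_fetch(agent, "/feed/") else "blocked")
```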


 4:42 am on Aug 17, 2011 (gmt 0)

while the robots exclusion protocol doesn't support globbing or regular expressions, many search engines (including G) support pattern matching extensions:
http://www.google.com/support/webmasters/bin/answer.py?answer=156449 [google.com]

the more important issue for your problem statement is that the Disallow syntax matches the url path left-to-right.

therefore if you want to take advantage of REP extensions to pattern matching you can disallow a url ending with "feed" using:
Disallow: /*feed$

however if you also/instead want to disallow a "feed" subdirectory url (i.e. ending with "feed/") you need a different rule:
Disallow: /*feed/$

also note that without the end anchor in the pattern (the "$") you will match more than intended, such that disallowing the pattern "/*feed" will disallow urls such as "/feedme" and disallowing the pattern "/*feed/" will disallow urls such as "/feed/me"
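
to make the difference between the anchored and unanchored patterns concrete, here is a small sketch of the extended matching described above. it is not any search engine's actual code: the function name rule_matches is made up, and it assumes "*" means "any run of characters" and a trailing "$" means "end of url", with everything else matched left-to-right as a prefix.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Check a url path against a Disallow pattern using '*' and trailing '$'.

    Illustrative only: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and without '$' the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if not anchored:
        regex += ".*"  # prefix match: anything may follow
    return re.fullmatch(regex, path) is not None

if __name__ == "__main__":
    for pattern in ("/*feed$", "/*feed/$", "/*feed", "/*feed/"):
        for path in ("/feed", "/blog/feed", "/feed/", "/feedme", "/feed/me"):
            verdict = "blocked" if rule_matches(pattern, path) else "allowed"
            print(f"{pattern:10} {path:12} {verdict}")
```

under these assumptions "/*feed$" blocks "/feed" and "/blog/feed" but leaves "/feedme" and "/feed/" alone, while the unanchored "/*feed" also catches "/feedme" and "/feed/me". note that "/*feed$" will also block a url like "/newsfeed", since the "*" can match any leading text.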


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved