


how to disallow feeds in robots.txt, readable by all bots

5:34 am on Aug 16, 2011 (gmt 0)

New User

joined:Aug 16, 2011
posts:1
votes: 0


How do I disallow feeds in robots.txt? I tried Disallow: /feed/ and it is not working. How do I disallow a URL ending with "feed"?
Is it Disallow: /feed$ or Disallow: /feed/$ ?
5:56 am on Aug 16, 2011 (gmt 0)

Senior Member from US 

tangor

joined:Nov 29, 2005
posts:6153
votes: 284


Either is correct, but bad bots will ignore it... and will also use those "hints" to decide what to rip.

Disallow: /feed all by itself will work for bots that honor robots.txt...
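
A minimal sketch of that plain prefix form (the paths in the comments are hypothetical examples, not from your site):

User-agent: *
# prefix match: blocks /feed, /feed/anything, and /feedme, but NOT /blog/feed
Disallow: /feed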
6:08 am on Aug 16, 2011 (gmt 0)

Senior Member from US 

lucy24

joined:Apr 9, 2011
posts:12714
votes: 244


:: cough, cough ::

Both $ forms are incorrect in robots.txt [robotstxt.org], because it doesn't "do" Regular Expressions.

Note also that globbing and regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot".


So if

Disallow: /feed/

isn't working, you need to bring out the heavy artillery, starting with .htaccess.
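
For example, a minimal .htaccess sketch along those lines (assumes Apache with mod_rewrite enabled; "BadBot" is a placeholder for whichever user-agent you want to block):

# return 403 for any request whose path contains "feed"
# when the user-agent matches the placeholder string
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule feed - [F]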
7:18 am on Aug 16, 2011 (gmt 0)

Senior Member from US 

tangor

joined:Nov 29, 2005
posts:6153
votes: 284


Correct, the regex $ is not required. Done. Otherwise, it is correct. The best method is to disallow ALL BOTS, then list which bots ARE ALLOWED, but that puts me in the minority (it's called whitelisting)...
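
A minimal sketch of that whitelist approach (Googlebot here is just an example of a bot you might choose to allow):

# allow only the bots named explicitly; everything else is shut out
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /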
4:42 am on Aug 17, 2011 (gmt 0)

Administrator

phranque

joined:Aug 10, 2004
posts:10544
votes: 8


while the robots exclusion protocol doesn't support globbing or regular expressions, many search engines (including G) support pattern matching extensions:
http://www.google.com/support/webmasters/bin/answer.py?answer=156449 [google.com]

the more important issue for your problem statement is that the Disallow syntax matches the url path left-to-right.

therefore if you want to take advantage of REP extensions to pattern matching you can disallow a url ending with "feed" using:
Disallow: /*feed$

however if you also/instead want to disallow a "feed" subdirectory url (i.e. ending with "feed/") you need a different rule:
Disallow: /*feed/$

also note that without the end anchor in the pattern (the "$") you will match more than intended: disallowing the pattern "/*feed" will disallow urls such as "/feedme", and disallowing the pattern "/*feed/" will disallow urls such as "/feed/me".
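
putting those together, a hypothetical robots.txt using the pattern-matching extensions might look like this (the example paths in the comments are assumptions, not from your site):

User-agent: *
# blocks /feed, /blog/feed, etc., but not /feedme
Disallow: /*feed$
# blocks /feed/, /blog/feed/, etc., but not /feed/me
Disallow: /*feed/$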
 
