
Forum Moderators: goodroi


how to disallow feeds in robots.txt, readable by all bots


bhavana

5:34 am on Aug 16, 2011 (gmt 0)



How do I disallow feeds in robots.txt? I tried Disallow: /feed/ and it is not working. How do I disallow a URL ending with "feed"?
Is it Disallow: /feed$ or Disallow: /feed/$ ?

tangor

5:56 am on Aug 16, 2011 (gmt 0)




Either is correct, but bad bots will ignore it... and will also use those "hints" to decide what to rip.

/feed all by itself will work for bots that honor robots.txt...
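Spelled out, that plain prefix rule would be the sketch below. Keep in mind Disallow matches by prefix, so this also blocks /feed/anything and even /feedme:

```
User-agent: *
Disallow: /feed
```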

lucy24

6:08 am on Aug 16, 2011 (gmt 0)




:: cough, cough ::

Both $ forms are incorrect in robots.txt [robotstxt.org], because it doesn't "do" Regular Expressions.

Note also that globbing and regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot".


So if

Disallow: /feed/

isn't working, you need to bring out the heavy artillery, starting with .htaccess.
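By "heavy artillery" I mean something like the minimal mod_rewrite sketch below (assuming Apache with mod_rewrite enabled), which returns 403 Forbidden for any request whose path ends in "feed" or "feed/". Note that in .htaccess context the matched path has no leading slash, hence the (^|/):

```
RewriteEngine On
RewriteRule (^|/)feed/?$ - [F]
```

Unlike robots.txt, this is enforced by the server itself, so it works whether or not the bot is polite.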

tangor

7:18 am on Aug 16, 2011 (gmt 0)




Correct, the regex $ is not required. Done. Otherwise, it's correct. The best method is to disallow ALL BOTS, then list which bots ARE ALLOWED, but that puts me in the minority (it's called whitelisting)...
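A whitelisting robots.txt along those lines might look like the sketch below (Googlebot stands in for whichever bots you actually want to allow; compliant bots obey the most specific User-agent group that matches them, and an empty Disallow means "allow everything"):

```
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
```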

phranque

4:42 am on Aug 17, 2011 (gmt 0)




while the robots exclusion protocol doesn't support globbing or regular expressions, many search engines (including G) support pattern matching extensions:
http://www.google.com/support/webmasters/bin/answer.py?answer=156449 [google.com]

the more important issue for your problem statement is that the Disallow syntax matches the url path left-to-right.

therefore if you want to take advantage of REP extensions to pattern matching you can disallow a url ending with "feed" using:
Disallow: /*feed$

however if you also/instead want to disallow a "feed" subdirectory url (i.e. ending with "feed/") you need a different rule:
Disallow: /*feed/$

also note that without the end anchor in the pattern (the "$") you will match more than intended: disallowing the pattern "/*feed" will disallow urls such as "/feedme", and disallowing the pattern "/*feed/" will disallow urls such as "/feed/me".
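the behavior of those extensions can be illustrated with a small python sketch. this is not any search engine's actual implementation, just a toy matcher for the two extensions ("*" = any sequence of characters, trailing "$" = end of url):

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Check a URL path against a robots.txt pattern with
    Google-style extensions ('*' wildcard, trailing '$' anchor)."""
    # Escape regex metacharacters, then restore the two extensions.
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    # Robots rules match from the start of the path.
    return re.match(regex, path) is not None

print(rule_matches("/*feed$", "/blog/feed"))    # True: ends with "feed"
print(rule_matches("/*feed$", "/blog/feed/"))   # False: trailing slash
print(rule_matches("/*feed/$", "/blog/feed/"))  # True: ends with "feed/"
print(rule_matches("/*feed", "/feedme"))        # True: no end anchor
print(rule_matches("/*feed/", "/feed/me"))      # True: no end anchor
```

which is exactly why you want the "$" anchor when you mean "ending with feed".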
 
