Questioning the wisdom of using fat pings to deal with scrapers


incrediBILL - 12:11 am on Nov 27, 2012 (gmt 0)


Why would you add the full RSS as a sitemap (and risk that it just might be discovered by scrapers, too) if Google only reads URLs from sitemaps?


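You can easily limit access to your RSS feed to a whitelist of crawlers, like Google, Bing, etc., using .htaccess, assuming you have mod_access installed (it's called mod_authz_host from Apache 2.2 on), which is pretty common. The X-Robots-Tag header in the example also needs mod_headers.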

Here's an untested example of how you might do this just for Google:

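<FilesMatch "\.rss$">
# Needs mod_headers: keep the feed out of search engine caches and snippets
Header set X-Robots-Tag "noarchive, nosnippet"
# Deny everyone first, then let the Allow lines below override for matching IPs
Order deny,allow
Deny from all
# Allow Google IPs (ranges as of this post; they change over time)
Allow from 64.18.0.0/20 64.233.160.0/19 66.102.0.0/20 66.249.80.0/20
Allow from 72.14.192.0/18 74.125.0.0/16 173.194.0.0/16
Allow from 207.126.144.0/20 209.85.128.0/17 216.239.32.0/19
</FilesMatch>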


I threw in a NOARCHIVE to make sure the feed isn't cached by the search engine, where scrapers could get at it as well.
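
If you're on Apache 2.4, the Order/Allow/Deny directives only work through mod_access_compat, and the native way to whitelist is Require ip. Here's an equally untested sketch of the same whitelist in that syntax. It assumes mod_authz_host and mod_headers are loaded and that your host's AllowOverride setting lets .htaccess use these directives. The IP ranges are just the ones from the example above.

<FilesMatch "\.rss$">
# Needs mod_headers: keep the feed out of search engine caches and snippets
Header set X-Robots-Tag "noarchive, nosnippet"
# Several Require lines are OR'ed together, so any matching range gets in
# and every other client is refused with a 403
Require ip 64.18.0.0/20 64.233.160.0/19 66.102.0.0/20 66.249.80.0/20
Require ip 72.14.192.0/18 74.125.0.0/16 173.194.0.0/16
Require ip 207.126.144.0/20 209.85.128.0/17 216.239.32.0/19
</FilesMatch>

Either way, request the feed from an IP outside the list and check that you get a 403 before you rely on it.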

These things aren't that hard to do once you learn how.


Thread source: http://www.webmasterworld.com/google/4520212.htm