IP is 72.14.199.x
Just recently I moved one of my sites from shared hosting to a VPS with a dedicated IP.
I believe this IP belongs to Google. Does it?
Anyhow, I'm getting a bunch of 404s, as this UA visits every day asking for stuff like
gallery2/main.php?g2_view=rss.SimpleRender&g2_itemId=16
How do I fix this? Is there a way to "tell" Google to stop asking for this RSS feed or whatever this is?
Is there any harm if I ban it? With robots.txt or .htaccess?
Thanks
The requests are for a file that is part of Gallery, a popular PHP image gallery package.
If you don't have Gallery installed (possibly the previous IP user did) you can try:
RewriteCond %{HTTP_HOST} .
RewriteRule ^gallery - [G]
This should return a 410 Gone response. Unfortunately Google tends to treat it the same as a 404 and will probably continue to request it for several months before giving up.
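For anyone who wants to narrow that rule down, here is a rough sketch. It assumes the old Gallery install lived under gallery2/ (as the logged request suggests), that mod_rewrite is enabled, and that the rules sit in the .htaccess file in the document root. The RewriteEngine line is only needed if it is not already switched on elsewhere.

```apache
# Assumption: this is the document-root .htaccess and mod_rewrite is available.
RewriteEngine On

# Answer anything under gallery2/ with "410 Gone".
# In .htaccess context the pattern is matched without the leading slash.
RewriteRule ^gallery2/ - [G]
```

The query string (g2_view=... and so on) is not part of what RewriteRule matches, so every Gallery URL under that directory gets the same response.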
If you have no RSS feeds you want indexed you can safely block the bot:
RewriteCond %{HTTP_USER_AGENT} Feedfetcher
RewriteRule .* - [F]
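A commented variant of the same block, as a sketch only. It assumes the user agent string contains "Feedfetcher" (Google's string starts with Feedfetcher-Google) and that mod_rewrite is enabled. The [NC] flag is an addition, not part of the rule above.

```apache
RewriteEngine On

# Match the Feedfetcher user agent; [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_USER_AGENT} Feedfetcher [NC]

# Refuse every request from that user agent with 403 Forbidden.
RewriteRule ^ - [F]
```

Keep in mind this blocks Feedfetcher for the whole site, so any feed you do publish will stop updating for readers who subscribed through Google.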
...
[webmasterworld.com...]
...
No, "User Agent" was not part of the UA string itself.
I don't use this, so I blocked it via .htaccess. I get too many 404s every day. I guess it may be a past submission or a wrong one, as it is requesting via the IP address, not my site's name.
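Since the requests arrive by IP address rather than by site name, one possible refinement is to turn Feedfetcher away only when the Host header is a bare IP. This is a sketch under assumptions: mod_rewrite is enabled, the rules are in the document-root .htaccess, and nobody legitimately subscribes to your feeds by IP address.

```apache
RewriteEngine On

# Host header is a dotted IPv4 address, optionally followed by a port.
RewriteCond %{HTTP_HOST} ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+(:[0-9]+)?$
# ...and the request comes from Feedfetcher.
RewriteCond %{HTTP_USER_AGENT} Feedfetcher [NC]

# Both conditions must match; answer with 410 Gone.
RewriteRule ^ - [G]
```

Requests for feeds under the site's real hostname are left alone with this version.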
Can this be abused? I mean, is it possible that someone would try harming your site that way, by submitting your IP to, in this case, Feedfetcher?
Can this be abused?
I wouldn't want to give a definitive answer, but the only tools I have seen using the 72.14.199.nn IP range are Feedfetcher and Googlebot-Mobile (both of which are also known to use 209.85.238.nn). I don't block them myself, and doubt that Feedfetcher itself could be the source of mischief.
But then I am constantly surprised by what is possible.
...
That seems to suggest you, or someone with your login credentials, has to submit the site. So it's probably what Samizdata suggested: the feed from whoever had the IP before you did is attracting Feedfetcher, and it will eventually give up and go away with no harm done.
Wouldn't the only way to submit the feed to Google be via Webmaster Tools?
No. The personalized home page iGoogle (google.com/ig) lets you add RSS feeds to your own page, and Feedfetcher is the UA Google sends when that page requests feeds from a site.
It might just be one person who added something from that site to their iGoogle page.
[google.com...]
I won't bother to quote it, as I'm sure the interested parties here will read it thoroughly.
...
the official Google Feedfetcher FAQ
I was there on the day before I initiated this post. All is good except this:
How do I request that Google not retrieve some or all of my site's feeds? Since Feedfetcher requests are all user-initiated, it does not follow the typical robots.txt guidelines for robots. For detailed instructions about how to prevent Feedfetcher from requesting all or part of your site, please see our Removals page.
There is nothing on the Removals page relating to Feedfetcher.
When I saw that the first time, naturally (organically), I entered the term "feedfetcher" into the search box and got this:
Your search - feedfetcher - did not match any answers in our Help Center. Please edit your search terms and try again.
Oh man...?!
I "killed" it in .htaccess by banning the user agent.
I sympathise with your view - few webmasters use a robots.txt file, but those who do would probably prefer it to apply to all non-human requests. The people who make bots do not like being thwarted though, and can interpret the (voluntary) protocol to suit themselves.
It's a jungle out there.
...