Well this Google feedfetcher is still showing up, about every 8-10 hours, always getting the same page.
So now I'm wondering if someone could have created a feed that includes my page. Is that possible? If so, why would anyone do it?
Not sure but I THINK feedfetcher is triggered by a human who wants to keep tabs on your page(s). I get a few such hits but because the bot shows up on multiple-function IPs with an ambiguous rDNS (in this case a proxy) I usually block the bot.
I suppose "proxy" is another way of representing this but it is G. :(
NetRange: 126.96.36.199 - 188.8.131.52
for it - but not a clear idea of who/what uses it.
There's a similar recent thread [webmasterworld.com]
Since we're talking about Google, I would like to ask about another recent log entry that puzzles me:
|Host: 184.108.40.206 |
Http Code: 200 Date: Jul 14 02:20:00 Http Version: HTTP/1.1 Size in Bytes: 44818
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
. . . . .
Services: None detected
Assignment: Static IP
Country: United States
City: Mountain View
The referer (elmi.aliexirs.ir) appears to be an Iranian website with a directory structure filled with scraped copies of pages from other websites.
What I'm thinking is that this could be referer spam using a fake googlebot agent, but the IP puzzles me. Can anyone elucidate?
There was some mention a while back by somebody, who said that google was going to begin showing some refers on crawled pages.
Could it be a genuine googlebot running under a "test as googlebot" service? I.e. a true google service, but run under external control.
If this IS the case it's a rather terrifying loophole.
If it's merely G adding an arbitrary referer then G has some serious answers to make to some serious questions!
I've seen many referrers in legit Googlebot requests. Why Googlebot sometimes includes referrers is a mystery. In the above situation, I tend to think that since this is a valid Googlebot IP, the UA is authentic. It IS Googlebot, and you've luckily been informed that a website has scraped your content (and stupidly left the links). Now the next step is to figure out what you're going to do about it, given the place of origin.
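For anyone who wants to check the "valid Googlebot IP" part themselves, here is a rough Python sketch of the reverse-then-forward DNS check that Google describes for verifying its crawlers. The function name `is_google_host`, the suffix list, and the two injectable lookup parameters are my own choices for illustration, and the forward lookup shown only handles IPv4, so treat it as a starting point rather than a complete verifier.

```python
import socket

# Hostname suffixes accepted for Google crawlers (assumed list; adjust as needed).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_host(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google hostname that
    forward-resolves back to the same ip.

    reverse_lookup and forward_lookup are hypothetical hooks so the logic
    can be exercised without touching the network.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ipv4_addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:
        # No PTR record or lookup failure: cannot be confirmed.
        return False

    # Step 1: the PTR name must sit under a Google crawler domain.
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Step 2: the name must resolve back to the original address,
    # otherwise anyone controlling their own rDNS could fake step 1.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_google_host("<ip from your log>")` with no extra arguments does the real DNS lookups. A UA string alone proves nothing, so this check is what separates a real Googlebot hit from a spoofed one.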
But if this is a genuine googlebot visit, that raises the question of why it provided a referer in this case but rarely does so in the vast majority of cases. My impression is that googlebot doesn't need a referer, and normally comes on its own, that being the reason that the logs of its visits normally don't show a referer. Yes I know that it supposedly follows links to find new pages, but after it finds a page, it can come on its own. So why did it show a referer in this case?
Googlebot DOES follow links from other sites. One indication of this is the plethora of incoming broken links (404s) reported at GWT. Link following is also a factor in determining PageRank. So we know that yes, Googlebot does crawl organically as well as by following incoming links found on remote web sites.
As I said, why Googlebot occasionally includes the referring link is a mystery. Could be by (yet to be determined) design, or a complete fluke. Don't think anyone really knows. Again, I see it a few times each week at a few sites I manage.
I wouldn't be too concerned unless it happens repeatedly (as keyplr suggested).
I post widget reference links in widget forums and google (and others) pick them up pretty fast and request the page.
Thanks for the replies. But I'm really not concerned about any of this, and only brought it up because I wasn't sure if it was a fake googlebot or referer spam or what. And if it's genuine, that doesn't bother me either since the pages on this site have already been scraped numerous times, so that once more won't make any difference.
|google was going to begin showing some refers on crawled pages |
Yikes. Do you mean, narrowly and specifically, pages? They often give referers for non-page requests-- lately most often with stylesheets-- but I've never seen them send a referer with a page request.
:: detour to check, thank you very much TextWrangler ::
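If anyone else wants to run the same check against their own logs, something along these lines should do it. It is only a sketch: it assumes the common "combined" access log format (referer and user-agent as the last two quoted fields), and the function name `googlebot_hits_with_referer` is made up for the example. It goes by the UA string alone, so fake Googlebots will show up too.

```python
import re

# Combined log format: ip ident user [time] "request" status size "referer" "agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_hits_with_referer(lines):
    """Return (ip, request, referer) for each line whose UA mentions
    Googlebot and whose referer field is not empty or '-'."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not in the assumed format; skip it
        if "Googlebot" not in match.group("agent"):
            continue
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # the normal case: no referer sent
        hits.append((match.group("ip"), match.group("request"), referer))
    return hits
```

Feed it an open log file, e.g. `with open("access.log") as f: print(googlebot_hits_with_referer(f))`. The request field tells you whether the referer came with a page or with a stylesheet or other non-page file.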