
Forum Moderators: Ocean10000 & incrediBILL


Google Feedfetcher

     

aristotle

8:40 pm on Jul 11, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Today I noticed Google Feedfetcher getting a page from one of my sites:
Host: 66.249.90.63
/Page.html
Http Code: 200 Date: Jul 11 15:09:39 Http Version: HTTP/1.0 Size in Bytes: 19650
Referer: -
Agent: Mozilla/5.0 (compatible) Feedfetcher-Google;(+http://www.google.com/feedfetcher.html)

I'm a bit puzzled because I've never had any kind of feed on any of my sites. So I'm wondering why Google feedfetcher would want this page. Does anyone have an explanation?

Note: This is a static html page that hasn't been touched in years. It has two images, but apparently they weren't fetched.

aristotle

6:40 pm on Jul 13, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Well this Google feedfetcher is still showing up, about every 8-10 hours, always getting the same page.

So now I'm wondering if someone could have created a feed that includes my page. Is that possible? If so, why would anyone do it?

dstiles

6:56 pm on Jul 13, 2014 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Not sure but I THINK feedfetcher is triggered by a human who wants to keep tabs on your page(s). I get a few such hits but because the bot shows up on multiple-function IPs with an ambiguous rDNS (in this case a proxy) I usually block the bot.

I suppose "proxy" is another way of representing this but it is G. :(

not2easy

7:07 pm on Jul 13, 2014 (gmt 0)

WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month



I have had problems seeing referer-spam via this google range, but I see that many of their tools like page-speed insights and javascript optimization also use that proxy. I'm watching it to decide whether it is worse to block it or allow it. I have

Host google-proxy-66-249-80-232.google.com
NetRange: 66.249.64.0 - 66.249.95.255
CIDR: 66.249.64.0/19

for it - but not a clear idea of who/what uses it.
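For anyone wanting to check whether a logged IP falls inside that block, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `in_google_range` is made up for illustration; the CIDR and the sample IPs are the ones quoted in this thread. Being inside the range only says the address is in that Google netblock, not which Google service sent the request.

```python
import ipaddress

# CIDR quoted in the post above (from the whois record).
GOOGLE_RANGE = ipaddress.ip_network("66.249.64.0/19")

def in_google_range(ip: str) -> bool:
    """Return True if ip falls inside 66.249.64.0/19 (hypothetical helper)."""
    return ipaddress.ip_address(ip) in GOOGLE_RANGE

# First and last addresses of the block, for comparison with the NetRange line.
print(GOOGLE_RANGE[0], "-", GOOGLE_RANGE[-1])

# IPs mentioned in this thread: Feedfetcher, the proxy host, and the Googlebot hit.
for ip in ("66.249.90.63", "66.249.80.232", "66.249.65.156"):
    print(ip, in_google_range(ip))
```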

wilderness

7:37 pm on Jul 13, 2014 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



There's a similar recent thread [webmasterworld.com]

aristotle

1:08 pm on Jul 14, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Since we're talking about Google, I would like to ask about another recent log entry that puzzles me:
Host: 66.249.65.156
/
Http Code: 200 Date: Jul 14 02:20:00 Http Version: HTTP/1.1 Size in Bytes: 44818
Referer: http://example.com/Page.html
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
. . . . .
IP: 66.249.65.156
Hostname: crawl-66-249-65-156.googlebot.com
ISP: Googlebot
Organization: Googlebot
Services: None detected
Type: Corporate
Assignment: Static IP
Country: United States
State/Region: California
City: Mountain View

The referer (elmi.aliexirs.ir) appears to be an Iranian website with a directory structure filled with scraped copies of pages from other websites.

What I'm thinking is that this could be referer spam using a fake googlebot agent, but the IP puzzles me. Can anyone elucidate?

wilderness

1:54 pm on Jul 14, 2014 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



There was some mention (a while back) by somebody who said that google was going to begin showing some refers on crawled pages.

dstiles

7:21 pm on Jul 14, 2014 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Could it be a genuine googlebot running under a "test as googlebot" service? I.e. a true google service but run under external control.

If this IS the case it's a rather terrifying loophole.

If it's merely G adding an arbitrary referer then G has some serious answers to make to some serious questions!

keyplyr

1:25 am on Jul 15, 2014 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I've seen many referrers in legit Googlebot requests. Why Googlebot sometimes includes referrers is a mystery. In the above situation, I tend to think that since this is a valid Googlebot IP, the UA is authentic. It IS Googlebot and you've luckily been informed that a website has scraped your content (and stupidly left the links). Now the next step is to figure out what you're going to do about it, given the place of origin.
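The "valid Googlebot IP" check is usually done as a reverse DNS lookup followed by a forward lookup that must return the same IP. Below is a rough sketch of that idea, not a definitive implementation. The function names and the suffix list are assumptions for illustration, and the lookups are injectable so the logic can be tried without touching the network. Note that both the `crawl-...googlebot.com` and `google-proxy-...google.com` hostnames seen in this thread would pass, so this confirms "Google", not which Google service.

```python
import socket

# Hostname suffixes treated as Google-owned (assumption for this sketch).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def reverse_lookup(ip):
    """PTR lookup; returns the hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

def forward_lookup(host):
    """A-record lookup (IPv4 only); returns a list of IPs, empty on failure."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []

def is_verified_google(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS: IP -> hostname -> must map back to IP."""
    host = reverse(ip)
    if not host:
        return False
    # Check the END of the name, so "googlebot.com.evil.example" is rejected.
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    return ip in forward(host)

# Offline demo with stubbed resolvers, using the hostname from the log above.
stub_reverse = lambda ip: "crawl-66-249-65-156.googlebot.com"
stub_forward = lambda host: ["66.249.65.156"]
print(is_verified_google("66.249.65.156", stub_reverse, stub_forward))
```

On a live server you would call `is_verified_google(ip)` with the default resolvers and probably cache the result, since two DNS lookups per request get slow.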

aristotle

11:51 pm on Jul 15, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



But if this is a genuine googlebot visit, that raises the question of why it provided a referer in this case but doesn't in the vast majority of cases. My impression is that googlebot doesn't need a referer, and normally comes on its own, that being the reason that the logs of its visits normally don't show a referer. Yes I know that it supposedly follows links to find new pages, but after it finds a page, it can come on its own. So why did it show a referer in this case?

keyplyr

12:02 am on Jul 16, 2014 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Googlebot DOES follow links from other sites. One indication of this is the plethora of incoming broken links (404s) reported at GWT. Link following is also a factor in determining Page Rank. So we know that yes, Googlebot does crawl organically as well as by following incoming links found on remote web sites.

As I said, why Googlebot occasionally includes the referring link is a mystery. Could be by (yet to be determined) design, or a complete fluke. Don't think anyone really knows. Again, I see it a few times each week at a few sites I manage.

wilderness

12:10 am on Jul 16, 2014 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



aristotle,
I wouldn't be too concerned unless it happens repeatedly (as keyplyr suggested).

I post widget reference links in widget forums and google (and others) pick them up pretty fast and request the page.

aristotle

12:31 am on Jul 16, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Thanks for the replies. But I'm really not concerned about any of this, and only brought it up because I wasn't sure if it was a fake googlebot or referer spam or what. And if it's genuine, that doesn't bother me either since the pages on this site have already been scraped numerous times, so that once more won't make any difference.

lucy24

1:49 am on Jul 16, 2014 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



google was going to begin showing some refers on crawled pages

Yikes. Do you mean, narrowly and specifically, pages? They often give referers for non-page requests-- lately most often with stylesheets-- but I've never seen them send a referer with a page request.

:: detour to check, thank you very much TextWrangler ::

Nope. Never.
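A log check like this can also be scripted. Here is a minimal sketch, assuming the raw logs are in Apache "combined" format (the entries quoted earlier in the thread are a reformatted display, so this is a guess). The regex, the function name and the "what counts as a page" suffix list are all assumptions, and the sample lines are made up from values in this thread with invented timestamps and sizes.

```python
import re

# Apache combined log format (assumed): ip, ident, user, [time], "request",
# status, bytes, "referer", "user-agent".
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Paths treated as "pages" rather than stylesheets, images, etc. (assumption).
PAGE_SUFFIXES = ("/", ".html", ".htm")

def googlebot_page_requests_with_referer(lines):
    """Return (ip, path, referer) for Googlebot page requests that sent a referer."""
    hits = []
    for line in lines:
        m = COMBINED.match(line)
        if not m or "Googlebot" not in m["agent"]:
            continue
        path = m["path"].split("?", 1)[0]  # ignore any query string
        if m["referer"] not in ("", "-") and path.lower().endswith(PAGE_SUFFIXES):
            hits.append((m["ip"], path, m["referer"]))
    return hits

UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Made-up sample lines: a page with a referer, a stylesheet with a referer,
# and a page without a referer.
sample = [
    f'66.249.65.156 - - [14/Jul/2014:02:20:00 +0000] "GET / HTTP/1.1" 200 44818 "http://example.com/Page.html" "{UA}"',
    f'66.249.65.156 - - [14/Jul/2014:02:20:05 +0000] "GET /style.css HTTP/1.1" 200 1200 "http://example.org/" "{UA}"',
    f'66.249.65.156 - - [14/Jul/2014:03:00:00 +0000] "GET /Page.html HTTP/1.1" 200 19650 "-" "{UA}"',
]
for hit in googlebot_page_requests_with_referer(sample):
    print(hit)
```

The user-agent match alone says nothing about whether the request really came from Google, so any hits are worth cross-checking against the IP.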
 
