Mozilla/5.0 (compatible; Google Keyword Tool; +https://adwords.google.

Anyone see this bot crawling lately

         

trinorthlighting

7:21 pm on Oct 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have recently noticed it visiting pages on my site that I do not use AdWords on. The IP does resolve to Google in California. Has anyone else seen it crawling their sites?
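For anyone wanting to repeat the "does the IP resolve to Google" check, here is a minimal sketch of a forward-confirmed reverse DNS lookup. The function name `is_google_ip` and the injectable `reverse`/`forward` parameters are my own for illustration. The suffix list assumes Google hosts reverse-resolve under google.com or googlebot.com, so adjust it if your logs show otherwise.

```python
import socket

# Assumption: Google crawler hosts reverse-resolve under these domains.
GOOGLE_SUFFIXES = (".google.com", ".googlebot.com")


def is_google_ip(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google hostname that
    forward-resolves back to the same ip (IPv4 only)."""
    try:
        hostname = reverse(ip)[0]  # (hostname, aliases, addresses)
    except OSError:
        return False  # no PTR record or lookup failure
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The forward lookup must include the original address, otherwise
    # the PTR record could have been set by anyone.
    return ip in addresses

# Usage (performs real DNS lookups):
#   is_google_ip("74.125.75.1")
```

The reverse lookup alone is not enough, since whoever controls an IP block controls its PTR records. The forward lookup is what confirms the hostname really belongs to that address.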

trinorthlighting

4:37 am on Nov 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



bump

wilderness

8:53 am on Nov 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There's an old thread (although not too old) of mine on the same occurrence, although the topic may have been moved to one of the Google forums.

I never did find an appropriate answer.

trinorthlighting

6:31 pm on Nov 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is strange. I wonder if human evaluators are using it to check pages. I am getting hits every day from it.

GaryK

7:15 pm on Nov 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Was the lack of a closing parenthesis intentional please?

Receptional Andy

7:19 pm on Nov 6, 2008 (gmt 0)



It's the Google Adwords keyword tool [adwords.google.com], when using "website content" (+ optional crawling) as the starting keywords. See Is the legitimate Google? [webmasterworld.com]

[edited by: Receptional_Andy at 7:19 pm (utc) on Nov. 6, 2008]

trinorthlighting

9:14 pm on Nov 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, it is the legit Google; we already figured it was the keyword tool. But who is using it? Human evaluators?

Receptional Andy

9:16 pm on Nov 6, 2008 (gmt 0)



Competitors, or other people researching keywords. I imagine quite a few people will put their target keywords into Google, and then plug some of the top sites into this tool when doing SEO keyword research.

The IP is Google, but it's triggered by their visitors using the tool.

trinorthlighting

12:02 am on Nov 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a hard time believing that a competitor is going to look at 2000 pages in a month, especially pages like reviews and pages that just do not rank whatsoever.

I wonder if there are some bots out there using this tool. What is hitting our site does not seem like human behavior; it definitely looks like bot behavior.

wilderness

12:56 am on Nov 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



trinorth,
providing a full IP range for a major SE (Google in this instance) is an acceptable practice.

Were your "bot-like" activities from one of the google tool ranges?
209.84-85?

trinorthlighting

4:33 am on Nov 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



They usually come from 74.125.75.1. There are a few other IPs, but they def are Google.

I wonder if some competitors have some bots running that use the keyword tool for info.

wilderness

5:13 am on Nov 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



trinorth,
You may take the following with "a grain of salt".

Were it my sites that were being repeatedly visited, crawled, or whatever, from the IP range you provided and under an AdWords UA (a program that neither you nor your sites participate in), I'd simply deny the range at the Class C level, and perhaps higher if the bot returned under another range.

I've portions of Google denied access to my sites, and these denials do not affect the continuous crawling from the standard Google bots (i.e., 66.249.xx.zzz).
In addition I've the 209. 84 & 85 Class B's denied as well as the 72.14. Class B.
In addition I've had the Google Image bot denied (since Aug 2006) over its crawling of pages excluded by robots.txt. Google supposedly intervened and corrected this manually, and the crawls did cease for a short while, but the bot returned within a short period of the correction.

In the end, we each make our own choices, as you'll be required to do.
This is not the logical explanation of Google's reason for grabbing your pages that you were hoping for.

Don
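As a rough illustration of denying a range at the "Class C" or "Class B" level, the sketch below tests a visitor's address against a deny list using Python's standard `ipaddress` module. It assumes "Class C" means a /24 and "Class B" a /16. The two networks are taken from the addresses mentioned in this thread, and the names `DENIED_NETWORKS` and `is_denied` are made up for the example.

```python
import ipaddress

# Assumption: "Class C" is treated as a /24 and "Class B" as a /16.
DENIED_NETWORKS = [
    ipaddress.ip_network("74.125.75.0/24"),  # Class C containing 74.125.75.1
    ipaddress.ip_network("72.14.0.0/16"),    # the 72.14. Class B
]


def is_denied(ip):
    """Return True if ip falls inside any denied network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DENIED_NETWORKS)


print(is_denied("74.125.75.1"))  # inside the denied /24
print(is_denied("66.249.66.1"))  # standard crawler range, not listed here
```

Widening a deny from the Class C to the Class B is then just a matter of changing the prefix length on the entry, e.g. `74.125.0.0/16`.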

Receptional Andy

10:28 am on Nov 7, 2008 (gmt 0)



It would be quite an advanced bot to get past the captcha and so on, although certainly technologically possible.

Note, though, that as per the thread above, the tool requests 10 URLs in one second if someone chooses the spidering option for keywords. So it doesn't take much human activity to reach 2000 hits.
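To put a number on that, here is a back-of-the-envelope sketch. It assumes each use of the spidering option fetches about 10 URLs and that a month is 30 days; both figures are assumptions for illustration, not measurements.

```python
import math

URLS_PER_RUN = 10    # assumption: one spidering run requests about 10 URLs
DAYS_PER_MONTH = 30  # assumption: round figure for a month


def runs_needed(hits, urls_per_run=URLS_PER_RUN):
    """Number of tool runs needed to produce the given number of hits."""
    return math.ceil(hits / urls_per_run)


runs = runs_needed(2000)
print(runs)                             # 200 runs in the month
print(round(runs / DAYS_PER_MONTH, 1))  # about 6.7 runs per day
```

Under those assumptions, 2000 hits works out to a handful of people (or one busy person) using the tool each day.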

If you think people are scraping the tool, you could always contact Google. They obviously don't want bots using it, hence the captcha.

Because the UA is consistent, it's easy enough to block, anyhow.
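A minimal sketch of matching on the UA, assuming the "Google Keyword Tool" token from the thread title is always present in the string. The function name `is_keyword_tool` is hypothetical, and how you act on a match (403, logging, etc.) depends on your server setup.

```python
# Token taken from the UA in the thread title; assumed to be stable.
UA_MARKER = "google keyword tool"


def is_keyword_tool(user_agent):
    """Return True if the User-Agent header looks like the keyword tool."""
    if not user_agent:
        return False  # missing or empty header
    return UA_MARKER in user_agent.lower()


# The UA as it appears (truncated) in the thread title:
print(is_keyword_tool("Mozilla/5.0 (compatible; Google Keyword Tool; +https://adwords.google."))
# A standard Googlebot UA should not match:
print(is_keyword_tool("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
```

Matching on the token rather than the full string sidesteps the missing closing parenthesis question raised earlier in the thread.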

wilderness

3:15 pm on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In addition I've the 209. 84 & 85 Class B's denied as well as the 72.14. Class B.

My apologies to everybody.

This [webmasterworld.com] should have read:
In addition I've the 66.249. 84 & 85 Class C's denied as well as the 72.14. Class B.

Don