cgi bad-bot script

interesting ip

         

dolcevita

3:41 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I was surprised to see this in my .htaccess log:

The ip address ^64\.233\.172\.21$ has been banned on Tue Apr 17 11:24:06 2007
The associated user agent was Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)
The associated referer was ......................

How is this possible? It is a Google IP address, yet it is surfing with an MSIE 7.0 user agent and ignoring robots.txt.

dolcevita

3:54 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



By the way, I just found this on Wikipedia:
---------------------
The IP address 64.233.172.21 is used by thousands of people in India who access the Internet with Data One, the ISP of the Government of India.
As such, this block hurts genuine users.

However, a WHOIS lookup on 64.233.172.21 appears to show that this address is part of the address block 64.233.160.0/19, issued by ARIN to Google. -- The Anome 07:42, 21 July 2006 (UTC)
-----------------------

How can that be possible when the range 64.233.160.0 - 64.233.191.255 belongs to Google?
Any ideas?

64.233.172.21
Host reachable, 1914 ms. average

64.233.160.0 - 64.233.191.255

Google Inc.
1600 Amphitheatre Parkway
Mountain View
CA
94043
United States

Google Inc.
+1-650-318-0200
arin-contact@google.com

NS1.GOOGLE.COM
NS2.GOOGLE.COM
NS3.GOOGLE.COM
NS4.GOOGLE.COM

GOOGLE
Created: 2003-08-18
Updated: 2007-04-10
Source: whois.arin.net
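For anyone who wants to double-check the range arithmetic, here is a minimal Python sketch using only the standard library. It confirms that 64.233.160.0/19 spans 64.233.160.0 - 64.233.191.255 and that the banned address falls inside it. The helper name `in_google_block` is made up for illustration; the only data used is what the WHOIS output above reports.

```python
import ipaddress

# Address block reported by the WHOIS lookup quoted above.
GOOGLE_BLOCK = ipaddress.ip_network("64.233.160.0/19")


def in_google_block(ip: str) -> bool:
    """Return True if the dotted-quad address lies inside GOOGLE_BLOCK."""
    return ipaddress.ip_address(ip) in GOOGLE_BLOCK


if __name__ == "__main__":
    # First and last address of the /19.
    print(GOOGLE_BLOCK[0], "-", GOOGLE_BLOCK[-1])
    # The address from the ban log.
    print(in_google_block("64.233.172.21"))
```

This only says who the block is registered to. It says nothing about whether a human or a bot is behind a given request.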

Dabrowski

4:07 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know, maybe Google provides their network?

I run a few internet servers, and my IP range is 80.x.x.x with a 255.255.255.248 netmask (a /29), but a lookup will still show my ISP.

I know that sounds silly, and Google isn't an ISP as far as I know, but who knows what they do in other countries?

dolcevita

4:12 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks, but any idea what to do? Block the IP or allow it?

Dabrowski

4:26 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The IP address 64.233.172.21 is used by thousands of people in India who access the Internet with Data One, the ISP of the Government of India.
As such, this block hurts genuine users.

Seems you answered your own question. Why block it, what has it done wrong?

wilderness

4:29 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



any idea what to do? Block the IP or allow it?

You need to decide on your own what is beneficial or detrimental to your website(s).

Google Web Accelerator
or
Google Translator
or
one of the many tool utilities that Google offers users.

AlexK

5:00 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Humans also surf from Google IPs. It's not all machines there.

dolcevita

6:46 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But it was some kind of bot, because it ignored the rules in robots.txt and was blocked by the bad-bot trap script.
It actually visited a forbidden URL by following a small 1x1 image link that is invisible to the human eye.

Dabrowski

6:49 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So.....he ignored robots.txt....and is using IE7.....refer to post about humans.

blend27

8:41 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have the same situation here, from slightly different IPs, with these user agents:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3

which got caught by the 1x1 pixel.

wilderness

9:15 pm on Apr 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Personally I use:
RewriteCond %{REMOTE_ADDR} ^64\.233\.(1[678][0-9]|19[01])\. [OR]

I don't like visitors coming in from an IP that is not actually their own.
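To see exactly which addresses a pattern like this covers, here is a rough Python sketch. It assumes Python's `re` module behaves the same as Apache's regex engine for this simple pattern, and the helper names are hypothetical. It lists every third octet the condition would match.

```python
import re

# Same pattern as the RewriteCond above, with a normal pipe character.
PATTERN = re.compile(r"^64\.233\.(1[678][0-9]|19[01])\.")


def matches_rule(ip: str) -> bool:
    """Return True if the address would satisfy the RewriteCond pattern."""
    return PATTERN.search(ip) is not None


def matched_third_octets() -> list:
    """List every third octet (0-255) of 64.233.x.1 that the pattern matches."""
    return [n for n in range(256) if matches_rule(f"64.233.{n}.1")]


if __name__ == "__main__":
    octets = matched_third_octets()
    print(octets[0], "to", octets[-1])
    print(matches_rule("64.233.172.21"))
```

The matched octets run from 160 to 191, which is the same span as the 64.233.160.0 - 64.233.191.255 range quoted earlier in the thread.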

dolcevita

1:00 am on Apr 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hmm, it is strange: another IP, 64.233.173.86, which belongs to the Google IP range, was bot-trapped again. This time:
=================
The ip address ^64\.233\.173\.86$ has been banned on Tue Apr 17 17:31:49 2007
The associated user agent was Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
The associated referer was
==============

It could be Google Accelerator, but I still do not know whether to block these IPs, which do not respect robots.txt.
Any bot caught spidering this site in violation of the robots.txt standard should be considered abusive.

Dabrowski

1:42 am on Apr 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't like visitors coming in from an IP that is not actually their own

So you block anyone behind a corporate firewall, NAT server, or even using ICS at home?

It could be Google Accelerator

Why don't you test it? Use the tool yourself and see if that IP shows up at the time. Isn't it reasonable for an SEO-type tool to browse your site as a user would?

incrediBILL

3:29 am on Apr 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google also has language translation tools and WAP services so blocking all things Google isn't very clever as lots of real humans come from those IPs, including Google employees of which there are a bunch.

I block anything that claims to be Googlebot or media-partners that doesn't originate from googlebot.com and everything else is automatically monitored for bad behavior just like I would any other IP address.
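A check along those lines could be sketched in Python as below. This is an assumption about how such a check might be written, not the actual setup described above: the function name and the injectable lookup parameters are hypothetical, and the forward-confirmation step (resolving the host name back to the same IP) is an extra precaution added here.

```python
import socket


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to *.googlebot.com and that host
    resolves back to the same ip. Lookups are injectable so the logic can
    be exercised without touching the network."""
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(".googlebot.com"):
            return False
        # Forward-confirm: the claimed host must map back to the same address.
        return ip in forward_lookup(host)
    except OSError:
        # DNS failures (socket.herror, socket.gaierror) count as unverified.
        return False
```

You would only call this for requests whose user agent claims to be Googlebot. Everything else from Google IPs gets the same behavior monitoring as any other visitor.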

Key_Master

4:21 am on Apr 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If it's Google Web Accelerator, look for an "X-moz: prefetch" header from your visitors.

You can block "prefetch" requests from triggering your bot trap by adding the following code to your .htaccess file (Apache servers). This way you won't inadvertently block innocent visitors from your site.

RewriteEngine on
Options +FollowSymLinks
RewriteCond %{HTTP:X-moz} ^prefetch [NC]
RewriteCond %{REQUEST_URI} ^/path_to_your_bad_bot_script\.cgi$ [NC]
RewriteRule .* - [F]

More info about Google Prefetch:
[webaccelerator.google.com...]
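If you would rather handle this inside the trap script itself, the same test can be sketched in Python. This assumes the script runs as a standard CGI program, where a request header such as X-moz is exposed as the environment variable HTTP_X_MOZ. The function names are hypothetical and not part of any existing bad-bot script.

```python
import os


def is_prefetch(environ=None) -> bool:
    """Return True if the request carries an X-moz header starting with
    'prefetch' (case-insensitive), mirroring ^prefetch [NC] above."""
    environ = os.environ if environ is None else environ
    return environ.get("HTTP_X_MOZ", "").lower().startswith("prefetch")


def should_ban(environ=None) -> bool:
    """Only treat a hit on the trap URL as a ban candidate when it is
    not a prefetch request."""
    return not is_prefetch(environ)


if __name__ == "__main__":
    print(should_ban({"HTTP_X_MOZ": "prefetch"}))
    print(should_ban({"REMOTE_ADDR": "64.233.172.21"}))
```

Keep in mind that any client can send this header, so a bad bot could use it to dodge the trap. Treat it as a way to reduce false positives, not as proof of innocence.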