Forum Moderators: DixonJones

Google?

was it a bot?

         

RobBroekhuis

11:49 am on Aug 9, 2004 (gmt 0)

10+ Year Member



Something interesting hit my logs this morning - a short string of accesses from an IP address assigned to Google Inc. (but reverse DNS fails, and it's not in the range of IP addresses that Googlebot uses), with a regular MSIE user agent string. Grabbed my home page, and robots.txt on both of my domains. What might this mean? I wouldn't even bother asking if I hadn't been a little paranoid about Google lately - the majority of my info pages seem to have fallen out of their index altogether - all recent bot activity has been from msn and Yahoo, none from Google.
Any thoughts?
Rob
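
The reverse DNS check described above can be made repeatable with a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the same IP. A minimal Python sketch follows. The function name `forward_confirmed_rdns` and the domain suffixes are assumptions for illustration, not anything stated in this thread; adjust the suffixes to whatever the crawler operator actually documents.

```python
import socket


def forward_confirmed_rdns(ip,
                           suffixes=(".googlebot.com", ".google.com"),
                           reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the verified hostname for ip, or None if any step fails.

    suffixes is an assumed list of acceptable domains (hypothetical default).
    reverse/forward are injectable so the logic can be tried without a network.
    """
    try:
        # PTR lookup: returns (hostname, aliaslist, ipaddrlist)
        host = reverse(ip)[0]
    except OSError:
        # No PTR record at all, i.e. "reverse DNS fails"
        return None

    if not host.lower().endswith(tuple(suffixes)):
        # Resolves, but not to a domain we accept
        return None

    try:
        # Forward lookup on the claimed hostname
        addresses = forward(host)[2]
    except OSError:
        return None

    # Only trust the hostname if it maps back to the original IP
    return host if ip in addresses else None
```

A result of `None` only means the address could not be verified this way; it does not say who actually operates it.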

bobothecat

12:14 pm on Aug 9, 2004 (gmt 0)



Saw the same thing yesterday. Picked up the index page, including images - and then robots.txt

64.233.***.*** - - [08/Aug/2004:07:36:14 -0500] "GET /robots.txt HTTP/1.1" 200 4708 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

Only happened with one of my sites; no visits like this to the others.

[edited by: Brett_Tabke at 12:19 pm (utc) on Aug. 9, 2004]

bobothecat

7:50 pm on Aug 9, 2004 (gmt 0)



Hmm... odd that we can post IP addresses for *new* MSN bots, but not Google:

[webmasterworld.com...]

Anyhow... if anyone's interested, sticky me and I'll give you the full IP.

RobBroekhuis

8:05 pm on Aug 9, 2004 (gmt 0)

10+ Year Member



For what it's worth, the first two IP numbers match the ones from my request (see original post).
Rob

fiestagirl

8:48 pm on Aug 9, 2004 (gmt 0)

10+ Year Member



Discussion here:
[webmasterworld.com...]

bobothecat

8:58 pm on Aug 9, 2004 (gmt 0)



>Discussion here:

Actually, I disagree: mediabot doesn't grab images, and this one came in like a 'real' user, then requested robots.txt. Plus, the site in question does not run AdSense, so I would think there would be little reason for this bot to visit.

To the best of my knowledge mediabot doesn't use the following UA:

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" - last I saw, this was the UA: "Mediapartners-Google/2.1"

Unless there's been a change, I'm afraid you may be wrong on this subject, but thanks for sharing.

[edited by: bobothecat at 9:45 pm (utc) on Aug. 9, 2004]

bobothecat

9:30 pm on Aug 9, 2004 (gmt 0)



Here's a snippet from today's logs for another site, one that does run AdSense, showing mediabot - clearly different behavior:

64.68.xx.xx - - [09/Aug/2004:02:24:33 -0500] "GET /robots.txt HTTP/1.1" 200 4614 "-" "Mediapartners-Google/2.1"
64.233.****.xxx - - [09/Aug/2004:02:24:34 -0500] "GET / HTTP/1.1" 200 29352 "-" "Mediapartners-Google/2.1"
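
To compare the two behaviors across a whole log rather than by eye, something like the following Python sketch could help. It assumes the Apache "combined" log format seen in the lines above, and it flags requests for /robots.txt whose user agent does not name a known crawler. The names `LOG_RE`, `KNOWN_BOT_TOKENS`, `parse_line` and `browser_ua_robots_hits` are made up for this example, and the token list is an illustrative assumption, not a complete list of crawlers.

```python
import re

# Apache "combined" format: host ident user [time] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative, incomplete list of substrings that self-declared crawlers use
KNOWN_BOT_TOKENS = ("googlebot", "mediapartners-google", "msnbot", "slurp")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


def browser_ua_robots_hits(lines):
    """Return parsed entries that fetched /robots.txt without a declared bot UA."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines in some other format
        agent = entry["agent"].lower()
        declared_bot = any(token in agent for token in KNOWN_BOT_TOKENS)
        if entry["path"] == "/robots.txt" and not declared_bot:
            hits.append(entry)
    return hits
```

Ordinary browsers rarely ask for robots.txt, so a hit from this filter is a hint worth a closer look, not proof of a bot.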

fiestagirl

4:54 pm on Aug 10, 2004 (gmt 0)

10+ Year Member



I don't think that you read the thread that I directed you to.

"They are using a normal browser user agent."
"getting css on a regular basis"
"The same IP group that appears as the Mediapartners bot" (not the MediaPartners Bot)

TheDoctor

1:43 pm on Aug 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Got scraped by what I think must be the same bot yesterday. It was identified by the perfectly innocent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" (as above), but it downloaded my entire site in less than forty minutes - so I don't think it was human-operated.

It came from an IP address that's logged as belonging to Verio.

FWIW it was immediately preceded (less than a minute earlier) by three requests by an agent identifying itself only as "JScript Processor" from the same IP address. I've not had any other requests from this IP address at any other point this month, so I sort of associate the two sets of requests.
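
The "whole site in under forty minutes" observation can be turned into a rough rate check. The Python sketch below takes (IP, timestamp) pairs, with timestamps in the access-log style shown earlier in the thread, and flags any IP that makes at least `min_requests` requests inside one `window`. The function names and both thresholds are assumptions chosen for illustration; sensible values depend on the size of the site.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_time(stamp):
    """Parse an access-log timestamp such as '19/Aug/2004:13:43:00 +0000'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def fast_clients(requests, min_requests=100, window=timedelta(minutes=40)):
    """Return {ip: total_requests} for IPs with a burst of min_requests in window.

    requests is an iterable of (ip, timestamp_string) pairs.
    The default thresholds are arbitrary placeholders.
    """
    by_ip = defaultdict(list)
    for ip, stamp in requests:
        by_ip[ip].append(parse_time(stamp))

    flagged = {}
    for ip, times in by_ip.items():
        times.sort()
        # Slide over runs of min_requests consecutive requests
        for i in range(len(times) - min_requests + 1):
            if times[i + min_requests - 1] - times[i] <= window:
                flagged[ip] = len(times)
                break
    return flagged
```

A fast client with a browser user agent is not necessarily a bot (offline-download tools behave the same way), so this is best treated as a shortlist for manual checking.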

Romeo

3:14 pm on Aug 19, 2004 (gmt 0)

10+ Year Member



Hi Rob,

Could be several things:
-- an innocent worker at Google, just surfing along?
-- someone at Google handchecking your site?
-- a new undercover bot to get the cloakers?

Anyway, thanks for the heads-up. Will check my logs later, too.

Regards,
R.