
Search Engine Spider and User Agent Identification Forum

    
New IPs -> google
littleman

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 133 posted 6:07 pm on Oct 5, 2000 (gmt 0)

Name: crawler4.googlebot.com - Address: 64.208.37.74
Name: crawler7.googlebot.com - Address: 64.208.37.53
Name: crawler8.googlebot.com - Address: 64.208.37.61
Name: crawl-56.googlebot.com - Address: 64.208.37.56
Name: crawl-19.googlebot.com - Address: 64.208.37.19

 

Gorufu

10+ Year Member



 
Msg#: 133 posted 10:09 pm on Oct 5, 2000 (gmt 0)

Hi littleman

Something very strange is happening with Google spiders. Their machine hostnames and IPs appear to be changing all the time.

These Google spiders have visited my domain this month:
crawl.googlebot.com
crawl-7.googlebot.com
crawl-55.googlebot.com
crawler4.googlebot.com

What software are you using to find the IP numbers?
All the crawl-xx spiders came back as unknown hosts for me, using both CyberKit (Win95) and telnet traceroute (FreeBSD O/S).

I just did a telnet traceroute 4 consecutive times on crawler8.googlebot.com. It resolved to a different IP each time, and none of them were responding.

tee-up:/# traceroute crawler8.googlebot.com
traceroute to crawler8.googlebot.com (64.208.36.69), 30 hops max, 40 byte packets
12 crawler3.googlebot.com (64.209.181.54) 3485.247 ms !H 3998.174 ms !H
tee-up:/# traceroute crawler8.googlebot.com
traceroute to crawler8.googlebot.com (64.208.36.78), 30 hops max,
12 crawler3.googlebot.com (64.209.181.54) 3573.358 ms !H 3998.512 ms !H
tee-up:/# traceroute crawler8.googlebot.com
traceroute to crawler8.googlebot.com (64.208.36.67), 30 hops max, 40 byte packets
12 crawler3.googlebot.com (64.209.181.54) 3699.230 ms !H 3998.808 ms !H
tee-up:/# traceroute crawler8.googlebot.com
traceroute to crawler8.googlebot.com (64.208.36.64), 30 hops max, 40 byte packets
12 crawler3.googlebot.com (64.209.181.54) 3515.124 ms !H 3999.294 ms !H

Using CyberKit NS Lookup, the following IPs were found:
Hostname: crawler8.googlebot.com
IP Address: 64.208.36.61
IP Address: 64.208.36.62
IP Address: 64.208.36.63
IP Address: 64.208.36.64
IP Address: 64.208.36.65
IP Address: 64.208.36.66
IP Address: 64.208.36.67
IP Address: 64.208.36.68
IP Address: 64.208.36.69
IP Address: 64.208.36.70
IP Address: 64.208.36.71
IP Address: 64.208.36.72
IP Address: 64.208.36.73
IP Address: 64.208.36.74
IP Address: 64.208.36.75
IP Address: 64.208.36.76
IP Address: 64.208.36.77
IP Address: 64.208.36.78
IP Address: 64.208.36.79
IP Address: 64.208.36.80

None of the 64.208.37.xx IPs are in the above list, yet a CyberKit lookup on them returned:

Hostname: crawler8.googlebot.com
IP Address: 64.208.37.61
Hostname: crawler8.googlebot.com
IP Address: 64.208.37.74

Use the following NetBlock lookup link and enter a known Google IP:
[arin.net...]

Google NetBlock 64.208.32.0 - 64.208.39.255
Global Crossing NetBlock 64.208.0.0 - 64.213.255.255
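
As a rough sketch of how the Google NetBlock above could be checked in a script, the range can be turned into CIDR form with Python's standard ipaddress module and then tested against IPs from the logs. The function name in_google_block is made up for illustration, and the range is only the one quoted from the ARIN lookup in this thread.

```python
import ipaddress

# Range quoted from the ARIN NetBlock lookup in this thread (assumption:
# it is only valid for the date of the post).
first = ipaddress.ip_address("64.208.32.0")
last = ipaddress.ip_address("64.208.39.255")

# summarize_address_range yields the smallest set of CIDR networks
# that exactly covers first..last.
google_nets = list(ipaddress.summarize_address_range(first, last))

def in_google_block(ip: str) -> bool:
    """Hypothetical helper: True if ip falls inside the quoted NetBlock."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in google_nets)

if __name__ == "__main__":
    print([str(net) for net in google_nets])
    for ip in ("64.208.37.74", "64.208.36.69", "64.209.181.54"):
        print(ip, in_google_block(ip))
```

Both the 64.208.36.xx and 64.208.37.xx addresses fall inside the block, while the 64.209.181.54 hop from the traceroute does not.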


littleman

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 133 posted 11:14 pm on Oct 5, 2000 (gmt 0)

>What software are you using to find the IP numbers?
Home grown stuff. I actually got these out of my logs.

>I just did a telnet traceroute 4 consecutive times on crawler8.googlebot.com and it was a different IP each time and none of them were responding.

Yeah I know!
Interestingly, I could resolve the IP [cgi-fun.hypermart.net] and get a different host name than the one that hit my logs. And then if I do a dnsquery [cgi-fun.hypermart.net] on that name I get 30 IPs with the same host name, which is in contrast with the other host name/IP assignments I get from an NS lookup on a specific IP. They must have one heck of a redundancy network.

But if you go and do an NSlookup on those IPs you will get a fixed host name. That is what I would go by in your records.
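
A minimal sketch of that idea, assuming a script that tags each logged IP with the host name returned by a reverse lookup. socket.gethostbyaddr is the standard library call; the fixed_hostname wrapper and its reverse parameter are invented here so the lookup can be swapped out for testing. The 2000-era addresses in this thread will not necessarily resolve the same way today.

```python
import socket

def fixed_hostname(ip, reverse=socket.gethostbyaddr):
    """Hypothetical helper: return the reverse-DNS host name for ip,
    or None when the lookup fails (unknown host, no PTR record, etc.)."""
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        name, _aliases, _addrs = reverse(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return name

if __name__ == "__main__":
    # IPs taken from the first post; results depend on live DNS.
    for ip in ("64.208.37.74", "64.208.37.53"):
        print(ip, fixed_hostname(ip))
```

Recording the name from the reverse lookup, rather than the name that a forward lookup happens to rotate through, keeps one stable label per IP in the records.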

PeteU

10+ Year Member



 
Msg#: 133 posted 1:28 am on Oct 6, 2000 (gmt 0)


64.208.32.0 - 64.208.39.255: this whole IP block belongs to Google, Inc.

methinks it would be pretty safe to show google spider pages to all requests coming from this IP range and with a Googlebot UA. Hostnames? nah.. DNS lookups only slow down servers ;)
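
For illustration only, the check described here (IP inside the block plus a Googlebot user agent, with no DNS lookup) might look like the sketch below. The name looks_like_googlebot is hypothetical, the /21 is just the 64.208.32.0 - 64.208.39.255 range from this thread written in CIDR form, and a user agent string on its own is trivially spoofed, which is why the IP test is combined with it.

```python
import ipaddress

# 64.208.32.0 - 64.208.39.255 as quoted in this thread (assumption:
# a historical range, not a current list of Google addresses).
GOOGLE_BLOCK = ipaddress.ip_network("64.208.32.0/21")

def looks_like_googlebot(ip: str, user_agent: str) -> bool:
    """Hypothetical check: IP is in the quoted block AND the UA mentions Googlebot."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed address in the log line: treat as not Googlebot.
        return False
    return addr in GOOGLE_BLOCK and "googlebot" in user_agent.lower()

if __name__ == "__main__":
    print(looks_like_googlebot("64.208.37.61", "Googlebot/2.1"))
    print(looks_like_googlebot("10.0.0.1", "Googlebot/2.1"))
```

Because nothing is resolved at request time, the check costs only an integer comparison per hit.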

littleman

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 133 posted 1:31 am on Oct 6, 2000 (gmt 0)

Yup.

Gorufu

10+ Year Member



 
Msg#: 133 posted 5:32 am on Oct 6, 2000 (gmt 0)

>They must have one heck of a redundancy network.

Just did a little more searching and found that their network infrastructure is almost beyond belief.

Off Topic: Global Crossing is providing the network solutions for Google. When I visited their website I found some very interesting information.

[GBLX.NET...]

Exodus Communications has entered a deal to merge with Global Crossing and purchase GlobalCenter. Exodus Communications is providing the network infrastructure for nearly all Inktomi spiders and many search engines that use the Inktomi database.

[webmasterworld.com...]

cirelle

10+ Year Member



 
Msg#: 133 posted 10:11 pm on Oct 22, 2000 (gmt 0)

not for nothing but google is huge. I imagine there are a lot of internal (workstation) IPs here, but it seems they could use any one of them for a spider or shadow:

GOOGLE (NETBLK-CW-204-188-0-B) CW-204-188-0-B
204.188.0.0 - 204.188.0.255
208.50.99.128 - 208.50.99.143
64.208.32.0 - 64.208.39.255
64.209.200.0 - 64.209.207.255
64.209.228.16 - 64.209.228.23
64.209.228.32 - 64.209.228.39
64.209.231.56 - 64.209.231.63
64.209.231.48 - 64.209.231.55
63.83.186.64 - 63.83.186.71

drbill

10+ Year Member



 
Msg#: 133 posted 3:17 pm on Oct 26, 2000 (gmt 0)

Littleman, it looks like Google is using "round robin" DNS for those spiders.
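
With round robin DNS, one host name carries several A records and the server rotates the order it hands them out, which would explain why each traceroute above picked a different IP. A small sketch of listing every address behind a name follows; socket.gethostbyname_ex is the standard call, while all_addresses and its resolve parameter are made up for this example.

```python
import socket

def all_addresses(hostname, resolve=socket.gethostbyname_ex):
    """Hypothetical helper: return every IPv4 address behind hostname,
    sorted numerically so repeated runs can be compared."""
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
    _name, _aliases, addrs = resolve(hostname)
    return sorted(addrs, key=lambda a: tuple(int(part) for part in a.split(".")))

if __name__ == "__main__":
    try:
        # Host name from this thread; it may no longer resolve.
        print(all_addresses("crawler8.googlebot.com"))
    except OSError as exc:
        print("lookup failed:", exc)
```

Sorting the result hides the rotation, so two lookups a minute apart should give the same list even though the first address returned by the server changes.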


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved