Possible AOL spider detected

Looking for confirmation...

Key_Master

8:11 am on Sep 8, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hit an ODP listed page and left. This AOL spider is unique in that it does not leave a browser agent nor does it index pictures (straight HTML only).

awoyo

2:52 pm on Sep 8, 2001 (gmt 0)

10+ Year Member



I don't think AOL spiders, per se; rather, it uses Inktomi to do its spidering. You don't have a UA, but do you have an IP address?

Key_Master

5:42 pm on Sep 8, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



AOL definitely uses their own spiders for the ODP sites in their directory. The spider will only hit ODP-listed URLs (once monthly, give or take a few days). I probably would have never caught it if it weren't for BanBot.

I don't think it will be of much use but the IP is 205.188.208.232 (AOL proxy).

Key_Master

5:05 pm on Sep 15, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I detected another hit from this "spider". It hit the same page from the same IP and did not have a user agent. Assuming it is hitting weekly, I should receive another hit on Saturday, 9/22/01.

Josk

3:59 pm on Sep 18, 2001 (gmt 0)

10+ Year Member



I'm currently listing the following as possible spiders...:

205.188.209.37
205.188.209.101
205.188.209.5
205.188.208.5
205.188.208.101
205.188.209.201
205.188.195.58

Anyone else seen evidence of being spidered by these critters?

volatilegx

4:59 pm on Sep 18, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Josk, are the IPs you listed suspected of being AOL spiders? Do they have UAs?

Josk

8:18 am on Sep 19, 2001 (gmt 0)

10+ Year Member



These are "suspected" IPs, but the user agents are for browsers. I've seen spider-like activity from these IPs.

However, I'm pretty wary of adding more than these as I realise AOL users come through proxy servers...

Brett_Tabke

8:49 am on Sep 19, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



KM, I agree with your conclusion about those IPs/agents. We've seen them wandering around for quite a while now (since late spring). They are the ones used for indexing the ODP sites.

volatilegx

4:55 pm on Sep 19, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are there any more IPs from AOL proxy servers spidering ODP listings?

If anybody comes across any more would you please post them here? I will do the same :)

Key_Master

5:44 pm on Sep 22, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well it hit again. Same IP with no user agent. It hits my site between the hours of 3 and 4 am each Saturday. Now to figure out why it only hits one ODP listed page (the home page).
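
A minimal sketch of how hits like these could be pulled out of an access log. It assumes the Apache "combined" log format, which nobody in the thread has confirmed they use; the function name and the sample lines are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def hits_without_user_agent(lines):
    """Count requests per IP whose user-agent field is empty or "-"."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match and match.group("agent") in ("", "-"):
            counts[match.group("ip")] += 1
    return counts

# Hypothetical sample lines, written for this example (not copied from a real log).
sample = [
    '205.188.208.232 - - [22/Sep/2001:03:41:10 +0000] '
    '"GET / HTTP/1.0" 200 5120 "-" "-"',
    '192.0.2.10 - - [22/Sep/2001:03:42:00 +0000] '
    '"GET / HTTP/1.0" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"',
]

print(hits_without_user_agent(sample))
```

A missing user agent is only a hint. Ordinary clients behind a proxy can also arrive without one, so the count is a starting point for a closer look, not proof of a spider.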

volatilegx

10:29 pm on Sep 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had these come visiting:

UA: "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
IP: 172.139.56.228
Hostname: AC8B38E4.ipt.aol.com

UA: "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
IP: 172.138.46.15
Hostname: AC8A2E0F.ipt.aol.com

grabbed robots.txt
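
Worth noting: in both hostnames above, the eight hex digits before .ipt.aol.com spell out the IP address, one octet per pair (AC = 172, 8B = 139, and so on). A small sketch of the decoding follows. That this holds for every ipt.aol.com hostname is an assumption based only on the examples in this thread, and the function name is hypothetical.

```python
import re

# Assumption: hostnames of the form XXXXXXXX.ipt.aol.com carry the IP
# address as eight hexadecimal digits, two per octet.
IPT_PATTERN = re.compile(r'^([0-9A-Fa-f]{8})\.ipt\.aol\.com$')

def ipt_hostname_to_ip(hostname):
    """Return the dotted-quad IP encoded in the hostname, or None if it does not match."""
    match = IPT_PATTERN.match(hostname)
    if not match:
        return None
    hex_digits = match.group(1)
    # Convert each pair of hex digits into one decimal octet.
    octets = [int(hex_digits[i:i + 2], 16) for i in range(0, 8, 2)]
    return ".".join(str(octet) for octet in octets)

print(ipt_hostname_to_ip("AC8B38E4.ipt.aol.com"))  # 172.139.56.228
print(ipt_hostname_to_ip("AC8A2E0F.ipt.aol.com"))  # 172.138.46.15
```

Comparing the decoded value with the logged IP is a quick sanity check that a hostname and an address really belong together.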

Luke

1:20 am on Oct 24, 2001 (gmt 0)

10+ Year Member



I have recently been spidered by AOL a couple of times. It came this morning around 7 am EST, around 9 pm EST, and again around 11 pm EST. Here is the info that it gave me:

IP Addresses: 205.188.199.161, 64.12.101.174
Hosts: spider-wm031.proxy.aol.com, spider-mtc-ti054.proxy.aol.com
It was also using MSIE 5.0 once and 6.0 another time.

I was only added to the ODP about 3 weeks ago. I am glad to see it spidered relatively quickly.

volatilegx

3:53 pm on Oct 24, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have had reports from fellow log analyzers that the IP addresses listed in this thread were also used by surfers, as evidenced by referring URLs in the log listings.

JeKarr

7:18 pm on Oct 29, 2001 (gmt 0)



I'm working on a list of AOL spiders; so far this is what I've come up with:

152.163.201.48: spider-tq013.proxy.aol.com
64.12.102.167: spider-mtc-tg042.proxy.aol.com
152.163.205.72: spider-ta062.proxy.aol.com
152.163.195.189: spider-te034.proxy.aol.com
64.12.104.58: spider-mtc-tb083.proxy.aol.com
205.188.193.152: spider-wd012.proxy.aol.com
152.163.207.204: spider-tl064.proxy.aol.com
205.188.192.38: spider-wa043.proxy.aol.com
64.12.103.162: spider-mtc-te032.proxy.aol.com
205.188.196.26: spider-wg021.proxy.aol.com
152.163.205.78: spider-ta073.proxy.aol.com
198.81.16.188: spider-ntc-tb083.proxy.aol.com
152.163.197.76: spider-tm071.proxy.aol.com
64.12.107.171: spider-mtc-tl051.proxy.aol.com
205.188.199.162: spider-wm032.proxy.aol.com
64.12.107.156: spider-mtc-tl021.proxy.aol.com
152.163.201.183: spider-tr023.proxy.aol.com
64.12.102.44: spider-mtc-th054.proxy.aol.com
152.163.197.67: spider-tm052.proxy.aol.com
64.12.102.168: spider-mtc-tg043.proxy.aol.com
152.163.213.203: spider-tj063.proxy.aol.com
152.163.205.77: spider-ta072.proxy.aol.com
64.12.102.157: spider-mtc-tg022.proxy.aol.com

I'm looking for more; if anyone has a bunch they could add, I would be greatly in your debt.

Thank you
Gary
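
The hostnames in this list all follow the shape spider-<something>.proxy.aol.com, so matching on the reverse-DNS name may be easier than maintaining individual IPs. A rough sketch follows; the pattern is inferred only from the hostnames posted in this thread, and the function name is hypothetical.

```python
import re

# Assumption: the naming pattern is inferred from hostnames posted in this thread.
SPIDER_HOST = re.compile(r'^spider-[a-z0-9-]+\.proxy\.aol\.com$', re.IGNORECASE)

def looks_like_aol_spider_host(hostname):
    """Return True if the reverse-DNS hostname matches the spider-*.proxy.aol.com shape."""
    return bool(SPIDER_HOST.match(hostname))

for host in (
    "spider-tq013.proxy.aol.com",
    "spider-mtc-tg042.proxy.aol.com",
    "AC8B38E4.ipt.aol.com",
):
    print(host, looks_like_aol_spider_host(host))
```

As noted earlier in the thread, addresses on this list have also been seen carrying ordinary surfers, so a hostname match should be treated as a hint and not as confirmation of a spider.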

guysmy

6:14 pm on Nov 7, 2001 (gmt 0)



These spiders have been hitting my page:

inktomi2-cam.server.ntl.com
spider-loh-a073.proxy.aol.com
wallaroo.looksmart.com

malariah

10:52 pm on Nov 18, 2001 (gmt 0)



These spiders have gone to my site:

64.12.103.27 spider-mtc-tf022.proxy.aol.com

195.93.65.179 spider-fra-tb064.proxy.aol.com

Friday

9:24 am on Feb 22, 2002 (gmt 0)

10+ Year Member



I've read that AOL does send out spiders to help rank its listings from ODP.

The danger in cloaking for these AOL spiders is that if you cloak to a caching spider that's collecting pages to serve up on its proxy server, then all AOL members will see your cloaked pages instead of the real thing!

I think those asking for robots.txt would be safe to "feed".

Thoughts?

Josk

9:59 am on Feb 22, 2002 (gmt 0)

10+ Year Member



Hmmm... recently we got stung a bit because AOL started sending users out on IPs we thought were for spiders. But we noticed quite quickly and switched it off...

At the moment we are being very careful...

volatilegx

6:43 pm on Feb 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Good call Josk... I did the same thing a while back. If you are cloaking it may be dangerous to have these IPs in your lists. It also looks like UA cloaking as a backup would be unreliable as well, because sometimes I've noticed that the spider uses a browser type UA.

Olaf

8:18 am on May 3, 2002 (gmt 0)

10+ Year Member



This one hit me tonight pretty hard. Not sure if it's an AOL spider or a user with a Site Grabber.

It did not request my robots.txt, and in a few hours it grabbed 37,000 pages.

It never gave referer information or a browser agent.

172.143.244.29 / AC8FF41D.ipt.aol.com

Olaf

mbauser2

9:39 pm on May 3, 2002 (gmt 0)

10+ Year Member



This is probably off base, but I thought I'd throw it out there: In a story at The Register, there's an offhand remark about AOL and Inktomi:

The company said that AOL continues to be an important customer of its content networking business - the ISP has Inktomi caches deployed all over its US network.

from [theregister.co.uk...]

So now we know Inktomi is caching stuff on AOL computers; maybe the mysterious AOL spiders are part of cache updating/verifying?

Olaf

1:13 am on May 4, 2002 (gmt 0)

10+ Year Member



Hi all,

When that crawler was up to 60,000 pages in just a few hours, I decided to send a post to AOL's "report abusive user" interface.

I told them that since that crawler was definitely ignoring all codes of conduct regarding crawlers it couldn't be their caching mechanism <grin>.

It gave no user agent or referrer information, and it totally ignored my robots.txt exclusions (it didn't even fetch the file, according to my log files).

About 20 minutes after sending the post, it stopped. Totally stopped and I haven't seen it since.

I got no reply from AOL (I explicitly asked for a reply/confirmation so that I could take ip banning measures)

Suspicious if you ask me :o

Olaf
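
The two signals described in these posts (a very high page count and no request for robots.txt) can be checked together. Below is a minimal sketch, assuming the log has already been reduced to (ip, path) pairs; the function name, the threshold and the sample data are all hypothetical.

```python
from collections import defaultdict

def flag_heavy_clients(requests, max_pages):
    """Return IPs that fetched more than max_pages pages and never asked for /robots.txt.

    requests: iterable of (ip, path) tuples taken from an access log.
    """
    page_counts = defaultdict(int)
    fetched_robots = set()
    for ip, path in requests:
        if path == "/robots.txt":
            fetched_robots.add(ip)
        else:
            page_counts[ip] += 1
    return sorted(
        ip for ip, count in page_counts.items()
        if count > max_pages and ip not in fetched_robots
    )

# Hypothetical data: one client grabs many pages without reading robots.txt,
# the other reads robots.txt first.
sample_requests = (
    [("172.143.244.29", f"/page{i}.html") for i in range(500)]
    + [("192.0.2.10", "/robots.txt")]
    + [("192.0.2.10", f"/page{i}.html") for i in range(500)]
)

print(flag_heavy_clients(sample_requests, max_pages=100))
```

The threshold depends entirely on the site. A run over a few hours of log data with a generous limit is a reasonable first pass before considering any IP ban.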