AOL

... and shifting IP addresses

         

jackson

3:50 am on Aug 12, 2004 (gmt 0)

10+ Year Member



I'm beginning to wonder about AOL. From my logs I have this:
205.188.116.74 - - <snipped log details>
205.188.116.133 - - <snipped log details>
205.188.114.16 - - <snipped log details>
205.188.116.72 - - <snipped log details>

I gather this pattern is typical of an AOL user. What I find disconcerting is:
a) the amount of data transferred each time they hit my site. It leaves me wondering if they've "leeched" or "sucked down" the whole site. I have anti-image-linking rules in my .htaccess and am starting to toy with some of the anti-leeching options discussed in this forum.

b) had this come up the other day:

205.188.116.208:Yahoo-MMCrawler/3.x :Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4

The IP address is an AOL address. The crawler went into folders disallowed in my robots.txt. This together with the Mac stuff leads me to think that this might be a "spoofed" header.
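One way to confirm that a client really went into disallowed folders is to replay its logged requests against robots.txt. The following is only a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt lines, the sample hits, and the helper name `disallowed_hits` are all hypothetical stand-ins, not taken from the logs above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the lines from your own file.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /private/",
]


def disallowed_hits(hits, robots_lines):
    """Return the (ip, user_agent, path) hits that robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # can_fetch() is False when the path is disallowed for that user agent.
    return [hit for hit in hits if not parser.can_fetch(hit[1], hit[2])]


# Made-up sample requests using documentation-only IP addresses.
hits = [
    ("192.0.2.10", "Yahoo-MMCrawler/3.x", "/private/notes.html"),
    ("192.0.2.10", "Yahoo-MMCrawler/3.x", "/index.html"),
]

violations = disallowed_hits(hits, ROBOTS_LINES)
for ip, agent, path in violations:
    print(f"{ip} ({agent}) fetched disallowed path {path}")
```

A well-behaved crawler should produce no violations, so any hit in this list is a reason to look harder at the user-agent string.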

Any opinions on what could be done here? I mean, AOL ought to be useful, but this sort of "maverick" behavior leaves me wondering whether to ban this whole IP address range or what?
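Before deciding on a range-wide ban, it may help to total the bytes served per address block rather than per IP, since one visitor can appear under several addresses. This is a rough sketch that assumes Apache common log format and a /24 grouping; both are assumptions, as are the sample lines (invented, with documentation-only addresses) and the function name `bytes_per_network`.

```python
import re
from collections import defaultdict
from ipaddress import ip_network

# Assumes common log format: ip ident user [time] "request" status bytes
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (\d+|-)')


def bytes_per_network(lines, prefix_len=24):
    """Sum bytes transferred per network block of the given prefix length."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ip, size = match.groups()
        try:
            network = ip_network(f"{ip}/{prefix_len}", strict=False)
        except ValueError:
            continue  # first field was a hostname or otherwise not an IP
        totals[str(network)] += 0 if size == "-" else int(size)
    return dict(totals)


# Invented sample lines for illustration only.
sample = [
    '192.0.2.74 - - [12/Aug/2004:03:00:01 +0000] "GET /a.html HTTP/1.0" 200 1200',
    '192.0.2.133 - - [12/Aug/2004:03:00:02 +0000] "GET /b.jpg HTTP/1.0" 200 3400',
    '198.51.100.16 - - [12/Aug/2004:03:00:03 +0000] "GET /c.html HTTP/1.0" 304 -',
]

totals = bytes_per_network(sample)
for network, total in sorted(totals.items()):
    print(network, total)
```

If one block accounts for far more data than a normal browsing session would, that is better evidence for a targeted ban than the shifting addresses alone.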

jdMorgan

4:01 am on Aug 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jackson,

Looks like a spoof to me... MMCrawler is not hosted on a Mac, for one thing. Questions of this nature should generally be posted in our Search Engine Spider Identification or Tracking and Logging forums, BTW.

I'd suggest you search WebmasterWorld [google.com] for Key_Master's bad-bot ban script and its derivatives. This is a script, now available in PHP as well as the original Perl, that you can use to trap harvesters and site scrapers by their behaviour. It involves placing 'bait' links in your pages, and disallowing the pages those links refer to in robots.txt. Any IP address that fetches one of those bait pages is added to a deny record in your .htaccess file.
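The trap logic described above can be sketched roughly as follows. This is not Key_Master's script, only an illustration of the idea in Python; the bait path, the sample requests, and the helper `deny_lines` are hypothetical, and the `Deny from` output assumes the older Apache access-control syntax.

```python
# Hypothetical bait page: linked from your pages, disallowed in robots.txt.
BAIT_PATHS = {"/trap/bait.html"}


def deny_lines(requests, bait_paths, already_denied=()):
    """Return one 'Deny from <ip>' line per new IP that fetched a bait page."""
    seen = set(already_denied)
    lines = []
    for ip, path in requests:
        if path in bait_paths and ip not in seen:
            seen.add(ip)  # avoid emitting duplicate deny records
            lines.append(f"Deny from {ip}")
    return lines


# Made-up (ip, path) pairs using documentation-only addresses.
requests = [
    ("192.0.2.10", "/trap/bait.html"),
    ("192.0.2.11", "/index.html"),
    ("192.0.2.10", "/trap/bait.html"),
]

new_rules = deny_lines(requests, BAIT_PATHS)
print("\n".join(new_rules))
```

In a real setup the bait page itself would run the script and append the line to .htaccess; here the lines are only returned so they can be reviewed first.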

Jim

jackson

5:42 am on Aug 12, 2004 (gmt 0)

10+ Year Member



Jim,

Thanks, I'll get digging.