205.188.116.74 - - <snipped log details>
205.188.116.133 - - <snipped log details>
205.188.114.16 - - <snipped log details>
205.188.116.72 - - <snipped log details>
I gather this pattern is typical of an AOL user. What I find disconcerting is:
a) the amount of data transferred each time they hit my site. It leaves me wondering if they've "leeched" or "sucked down" the whole site. I have anti-image-linking rules in my .htaccess and am starting to toy with some of the anti-leeching options discussed in this forum.
b) had this come up the other day:
205.188.116.208:Yahoo-MMCrawler/3.x :Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4
Any opinions on what could be done here? I mean, AOL ought to be useful, but this sort of "maverick" behavior leaves me wondering - as in, should I ban this whole IP address range, or what?
Looks like a spoof to me... MMCrawler is not hosted on a Mac, for one thing. Questions of this nature should generally be posted in our Search Engine Spider Identification or Tracking and Logging forums, BTW.
I'd suggest you search WebmasterWorld [google.com] for Key_Master's bad-bot ban script and its derivatives. This is a script, now available in PHP as well as the original Perl, that you can use to trap harvesters and site scrapers by their behaviour. It involves placing 'bait' links in your pages and disallowing the pages those links point to in robots.txt. Well-behaved robots obey robots.txt and never request those pages, so any IP address that does fetch one is added to a deny record in your .htaccess file.
Jim
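For anyone who wants to see the shape of the bait-link idea before hunting down the script, here is a minimal sketch in Python. It is not Key_Master's script, just an illustration of the technique as described above. The bait prefix /trap/ and the function names ban_ip and handle_request are made up for the example, and it assumes the older Apache "Deny from" syntax (Apache 2.4 uses "Require not ip" instead). It also assumes robots.txt already contains "User-agent: *" followed by "Disallow: /trap/".

```python
import ipaddress
from pathlib import Path

# Hypothetical bait location; it must also be disallowed in robots.txt.
BAIT_PREFIX = "/trap/"


def ban_ip(htaccess_path, ip):
    """Append 'Deny from <ip>' to the .htaccess file unless it is already there.

    Returns True if a new deny line was written, False otherwise.
    """
    # Validate the address so arbitrary text never lands in .htaccess.
    ip = str(ipaddress.ip_address(ip))
    path = Path(htaccess_path)
    existing = path.read_text() if path.exists() else ""
    deny_line = f"Deny from {ip}"
    if deny_line in (line.strip() for line in existing.splitlines()):
        return False
    with path.open("a") as f:
        # Keep the new rule on its own line even if the file lacks a final newline.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(deny_line + "\n")
    return True


def handle_request(request_path, client_ip, htaccess_path):
    """Ban the client if it fetched a bait page; ignore every other request."""
    if request_path.startswith(BAIT_PREFIX):
        return ban_ip(htaccess_path, client_ip)
    return False
```

In practice the bait page itself would be a small script that calls something like handle_request with the visitor's address. A real deployment would also want file locking and a whitelist for any search engine spiders you care about, which this sketch leaves out.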