Forum Moderators: phranque


BrowserMatch "perl" ... BrowserMatch "php" - Good idea?

Block out a bunch of junk

         

Thanasus

3:46 pm on Dec 10, 2003 (gmt 0)

10+ Year Member



Looking through my logs for the night, I noticed a few different user agents containing "perl" as a substring. I am thinking of blocking all requests that have "perl" or "php" as a substring in the user agent. That should not prevent any good bots from crawling. Yeah, it won't block homemade scripts where the coder spoofs the UA, but it should catch quite a bit of off-the-shelf garbage.
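For reference, a minimal sketch of what that block might look like, assuming mod_setenvif is loaded and the older Order/Allow/Deny access syntax (Apache 1.3/2.0/2.2) is in use. The variable name `bad_bot` and the directory path are placeholders picked for illustration, not anything from my actual config.

```apache
# Tag any request whose User-Agent contains "perl" or "php".
# BrowserMatchNoCase matches case-insensitively, so "Perl" and "PHP" are caught too.
BrowserMatchNoCase "perl" bad_bot
BrowserMatchNoCase "php" bad_bot

# Hypothetical document root; adjust to the real path.
<Directory "/var/www/html">
    Order Allow,Deny
    Allow from all
    # Refuse anything that was tagged above.
    Deny from env=bad_bot
</Directory>
```

In a .htaccess file the `<Directory>` wrapper is not allowed, so the Order/Allow/Deny lines would go in bare. Because the patterns are unanchored substrings, any UA that merely contains "php" or "perl" anywhere gets tagged.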

Feedback?

jdMorgan

4:16 am on Dec 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Be careful with "libwww-perl" -- AltaVista, Inktomi, and IA Archiver use it for site verification purposes (or something).
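If you want to keep the broad "perl" match but let libwww-perl through, one way is to clear the variable again for that UA. This is only a sketch: `bad_bot` is a made-up variable name, and it assumes mod_setenvif with the Order/Allow/Deny access syntax.

```apache
# Tag anything with "perl" or "php" in the User-Agent.
BrowserMatchNoCase "perl" bad_bot
BrowserMatchNoCase "php" bad_bot

# The "!name" form removes the variable again. mod_setenvif directives are
# processed in the order they appear, so this exception must come after
# the lines that set bad_bot.
BrowserMatchNoCase "libwww-perl" !bad_bot

Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

The trade-off is that this exempts every script using the stock libwww-perl UA, not just the search engines, so it gives back a good share of what the "perl" rule would have caught.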

Jim

Thanasus

12:40 am on Dec 12, 2003 (gmt 0)

10+ Year Member



IA Archiver I already ban by bot name.

However, I was unaware that Inktomi or AV had secondary crawlers with "perl" as a substring in the UA. I thought they only had

Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]

and

Scooter (with version number)

Do they have some sort of secondary crawler? I already banned the perl substring and haven't seen a decrease in their visits.

jdMorgan

1:00 am on Dec 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> haven't seen a decrease in their visits.

Do they manually review your site occasionally? If you block them, you certainly might see "a decrease in their visits!" :o

I think it's important - you may not.

Jim

Thanasus

3:11 am on Dec 12, 2003 (gmt 0)

10+ Year Member



Hmmmm, no, they don't do any manual review that I am aware of. I don't do any PFI with any of them; I just cross my fingers and hope to be picked up. I've got Slurp typically on one of my pages every 2 minutes or so nonstop