The link does not provide any info on the bot's activity, nor does it offer an example of a robots.txt exclusion.
bad idea?
Each webmaster needs to decide what is beneficial or detrimental to their own website(s).
Denying ".edu" is a bad idea for me, as I have far too many inquiries and communications with educational archive departments.
Not to mention some primary edu centers that provide links to my pages.
I do, however, have some third-party research centers denied access.
Some good archive reading is the "keebler cookie" company, which in my own 2003 instance was actually FoMoCo.
A few weeks ago, I visited the Benson Ford Research Center and returned "tit-for-tat" ;)
Don
Hardly seems worth the trouble, eh?
What I do block, after allowing the whitelisted bots access, is anything with "http:" in the user agent. That nails just about every bot on the planet that embeds a path to its site, plus a few odd browser plug-ins. Those could be whitelisted to avoid blocking, but their advertising in my web logs annoys me.
The browser plug-ins only account for maybe 20-30 visitors a day out of about 20K visitors, but the blocking caused so many tech-support hassles for one browser plug-in that they revised their code to remove that http: path from the UA string.
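For anyone who wants to try the same rule, here is a rough Python sketch of the check described above: whitelisted bots pass first, then any user agent containing "http:" is refused. The names in WHITELISTED_BOTS and the function should_block are placeholders made up for illustration; the actual whitelist and server setup aren't given in this thread.

```python
# Hypothetical whitelist: the thread does not say which bots are allowed.
WHITELISTED_BOTS = ("googlebot", "bingbot")


def should_block(user_agent: str) -> bool:
    """Return True if a request with this user agent should be refused."""
    ua = user_agent.lower()
    # Whitelisted bots are let through before any other rule is applied.
    if any(bot in ua for bot in WHITELISTED_BOTS):
        return False
    # Everything else that embeds "http:" in its UA string gets blocked.
    # Note: the literal string "http:" does not match "https:" URLs.
    return "http:" in ua
```

Called as should_block(ua) from whatever handles the request, it only decides yes or no; how the refusal is served (403, empty page, etc.) is up to you.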
Who said a single website can't make a difference? LOL
Good. Would you do me a favor, then? Make some tech-support trouble for those plugins that include their long CLASSID number as well. :)
I still haven't figured out what these suckers are:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; {C10F4731-13CF-17A6-BD0D-2DFED03246AE}; .NET CLR (etc.)
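If you just want to count how many of these turn up in your logs, a small Python sketch like the following would flag any UA carrying a brace-wrapped ID of that shape. The pattern is only inferred from the one example above (8-4-4-4-12 hex digits in braces), and GUID_TOKEN and has_classid are names invented here, so treat it as a starting point rather than a definition of the format.

```python
import re

# Assumed shape, based on the single sample UA: {8-4-4-4-12 hex digits}.
GUID_TOKEN = re.compile(
    r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"
)


def has_classid(user_agent: str) -> bool:
    """Return True if the user agent contains a brace-wrapped ID token."""
    return GUID_TOKEN.search(user_agent) is not None
```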
(I should add that Yangbo looks like a pleasant-faced doctoral candidate who just didn't read up on robots.txt while working on his data-mining projects.)
Jim
Make some tech-support trouble for those plugins that include their long CLASSID number as well.
I get a ton of those long CLASSIDs, so that wouldn't be good for business.
I figured annoying only 20-30 visitors a day was acceptable collateral damage compared to the hundreds of bots it stops for the same reason.