Please check my list and see whether any entries are not search engines, or please add the UAs of any good search engines that are missing ...
My list so far:
ArchitextSpider
Ask Jeeves/Teoma
BackRub/2.1
Googlebot
Googlebot-Image
Googlebot/1.0
Googlebot/2.1
Googlebot/2.2
Gulliver
Gulliver/1.3
InfoSeek Robot 1.0
InfoSeek Sidewinder/0.9
Lycos_Spider
Lycos_Spider_(T-Rex)
Mercator-2.0
Mozilla/3.01 (hotwired-test/0.1)
Mozilla/4.0
Northern Light Gulliver
Scooter
Scooter/1.0
Scooter/2.0 G.R.A.B. V1.1.0
Scooter/2.0 G.R.A.B. X2.0
Slurp
smallbear
T-rex
Ultraseek
vscooter/2.0 G.R.A.B. V1.1.0
The biggest problem with that list is that it contains a few very old UAs (when was the last time you saw Googlebot/1.0?) and that it doesn't have full user-agent strings...
Partial user-agents are a problem because the moment you try to match them against a real user-agent you are going to suffer from inaccuracies - for example, a partial match would assume that any user-agent which includes the word "slurp" is the Inktomi crawler.
The nature of UA's means that a partial matching strategy is bound to hit issues sooner or later...
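To make the point concrete, here is a rough Python sketch comparing a partial (substring) match with an exact match on the full string. The function names and the "MySiteSlurper/0.1" user-agent are made up for illustration; the fragments and the full Scooter strings are taken from the list above.

```python
# Fragments from the list above, used for partial matching.
PARTIAL_UAS = ["Slurp", "Scooter", "Googlebot"]

# Full strings from the list above, used for exact matching.
KNOWN_FULL = {
    "Scooter/2.0 G.R.A.B. V1.1.0",
    "Scooter/2.0 G.R.A.B. X2.0",
}


def partial_match(user_agent, fragments=PARTIAL_UAS):
    """Return the first fragment found anywhere in the UA, ignoring case."""
    ua = user_agent.lower()
    for fragment in fragments:
        if fragment.lower() in ua:
            return fragment
    return None


def exact_match(user_agent, known=KNOWN_FULL):
    """Return True only if the whole UA string is in the known set."""
    return user_agent in known


# Hypothetical UA that merely contains the letters "slurp".
impostor = "MySiteSlurper/0.1"

print(partial_match(impostor))  # the substring test flags it as "Slurp"
print(exact_match(impostor))    # the exact test does not accept it
```

The exact match is stricter, so it will also miss a genuine crawler the moment its version number changes. That is the trade-off: partial matching lets impostors in, exact matching needs the list kept up to date.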
If it were me I'd start from scratch by trawling through my own logfiles, building a list of the UAs found there, and then working outwards from that point.
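As a starting point for the logfile approach, something like the Python sketch below would tally the user-agents. It assumes the server writes the "combined" log format, where the user-agent is the last quoted field on each line; the function name and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# Assumption: combined log format, so the UA is the final quoted field.
UA_AT_END = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count how often each full user-agent string appears in the log lines."""
    counts = Counter()
    for line in lines:
        match = UA_AT_END.search(line)
        if match:  # skip lines that do not end in a quoted field
            counts[match.group(1)] += 1
    return counts


# Invented sample lines; the UA strings come from the list above.
sample = [
    '192.0.2.1 - - [01/Jan/2002:00:00:01 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Scooter/2.0 G.R.A.B. V1.1.0"',
    '192.0.2.1 - - [01/Jan/2002:00:00:05 +0000] "GET / HTTP/1.0" 200 5120 "-" "Scooter/2.0 G.R.A.B. V1.1.0"',
    '192.0.2.7 - - [01/Jan/2002:00:01:00 +0000] "GET / HTTP/1.0" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.9 - - [01/Jan/2002:00:02:00 +0000] "GET / HTTP/1.0" 200 5120',
]

for ua, hits in count_user_agents(sample).most_common():
    print(hits, ua)
```

To run it over a real log, pass the open file instead of the sample list, e.g. `count_user_agents(open("access.log"))`, where the path is whatever your server uses. The output is only a list of candidates; each UA still has to be checked by hand before it goes on the search-engine list.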
Missing stuff
The most obvious missing entries are Fast/AllTheWeb and WiseNut (Zyborg).
- Tony