Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

User Agent "-"
missing User Agent

 7:41 am on Apr 26, 2012 (gmt 0)

I have started to see some activity: accesses that do not have a UA. For example:

- - [26/Apr/2012:00:07:03 -0600] "GET / HTTP/1.0" 200 6727 "-" "-"
- - [26/Apr/2012:00:49:09 -0600] "HEAD /reviews/ HTTP/1.1" 200 - "-" "Mozilla/4.5 [en] (Win98; I)"
- - [25/Apr/2012:13:18:40 -0600] "HEAD / HTTP/1.0" 200 - "-" "-"
- - [25/Apr/2012:06:18:57 -0600] "HEAD / HTTP/1.0" 200 - "-" "-"

How do I block these? They come from a variety of IPs and the "-" stuff isn't even consistent. Is there a way to block such accesses, short of blocking each IP as I notice it?
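For anyone wanting to pick these entries out of a log before deciding how to block them, here is a minimal Python sketch. It assumes the log uses the combined format shown above, where the last two quoted fields are the referrer and the user agent. The function name `has_blank_user_agent` is made up for illustration, and the sketch only identifies the requests; it does not block anything.

```python
import re

# Assumption: combined log format, where each line ends with
# "referrer" "user-agent" as the last two quoted fields.
TRAILING_FIELDS = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def has_blank_user_agent(line):
    """Return True if the log line's user-agent field is empty or "-"."""
    match = TRAILING_FIELDS.search(line)
    if match is None:
        # Line does not look like a combined-format entry; do not flag it.
        return False
    return match.group("agent").strip() in ("", "-")


if __name__ == "__main__":
    # Two of the entries quoted in the post above.
    samples = [
        '- - [26/Apr/2012:00:07:03 -0600] "GET / HTTP/1.0" 200 6727 "-" "-"',
        '- - [26/Apr/2012:00:49:09 -0600] "HEAD /reviews/ HTTP/1.1" 200 - "-" '
        '"Mozilla/4.5 [en] (Win98; I)"',
    ]
    for entry in samples:
        print(has_blank_user_agent(entry), entry)
```

Note that the second sample does carry a UA string, so a blank-UA test alone would not flag it.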



 3:27 pm on Apr 26, 2012 (gmt 0)

"blank user-agent" [google.com]


 4:49 pm on Apr 26, 2012 (gmt 0)

I apply an additional header test to these types of fetches. I've seen the satellite providers occasionally using blank UA and referrer when fetching/caching images. The other header test catches things like you posted 99.99% of the time and lets the legit cache fetches through.
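A rough Python sketch of the two-step idea: only treat a blank UA as suspect when a second header is also missing. The post does not say which header is tested, so the `Accept` default below is purely a placeholder assumption, and `should_block` and `is_blank` are hypothetical names.

```python
def is_blank(value):
    """Treat a missing, empty, or "-" header value as blank."""
    return value is None or value.strip() in ("", "-")


def should_block(headers, second_header="Accept"):
    """Decide whether to block a request based on its headers.

    headers: dict mapping header names to values.
    second_header: the extra header to test. "Accept" is an assumption
    made for this sketch, not the header the poster actually uses.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Requests that send a user agent are not this check's concern.
    if not is_blank(normalized.get("user-agent")):
        return False

    # Blank UA: block only if the second header is blank as well, so that
    # blank-UA fetches which do send it (e.g. cache fetches) get through.
    return is_blank(normalized.get(second_header.lower()))


if __name__ == "__main__":
    print(should_block({"User-Agent": "-", "Referer": "-"}))
    print(should_block({"Accept": "image/*"}))
```

Whether this lets the legitimate cache fetches through depends entirely on which second header is chosen, so it would need checking against real logs first.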


 5:24 pm on Apr 26, 2012 (gmt 0)

Do you want to block search spiders, or just the ones that don't provide a moniker?

A cookie check will trap most spiders... write a cookie onLoad and then try to read it... if no cookie then redirect. Lack of JavaScript can have the same effect... using a NoScript statement that includes a redirect.


 10:38 pm on Apr 26, 2012 (gmt 0)

Lotsa humans around here block cookies and/or javascript by default. Higher level of knowledge and/or higher level of paranoia than the average user in the street.


 4:40 pm on May 9, 2012 (gmt 0)

Must be paranoia and ignorance. Flash and video usually use JavaScript to detect the browser. Without browser detection you don't see the show, nor do you see a page layout optimised for screen or window size.


 5:49 pm on May 9, 2012 (gmt 0)

It doesn't matter what reason the user has: if you block users because they do not enable cookies, you're doing yourself a disservice. About 10% of users do not enable cookies.


 8:01 pm on May 9, 2012 (gmt 0)

Kendo - cookies are no guarantee. The technique may work with "real" bots but a LOT of scrapers maintain cookies - which is especially useful for dealing with IP-hoppers. :)

In addition I (for example) refuse cookies on most sites I visit. Ditto JS. As for Flash - I almost never view that either.

There is also a UK law now that says that web sites must offer an option to not accept cookies. This is very badly thought out (e.g. no mention of how US etc. sites can be trapped) but it's bureaucrats, so you can't really expect much logical thought on the matter. Nevertheless, a few web sites in the UK will either implement an option or dump cookies completely where they are of little use.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved