

Search Engine Spider and User Agent Identification Forum

    
User Agent "-"
missing User Agent
dupres01



 
Msg#: 4445845 posted 7:41 am on Apr 26, 2012 (gmt 0)

I have started to see some activity: accesses that do not have a UA. For example:
91.121.132.213 - - [26/Apr/2012:00:07:03 -0600] "GET / HTTP/1.0" 200 6727 "-" "-"
24.178.44.245 - - [26/Apr/2012:00:49:09 -0600] "HEAD /reviews/ HTTP/1.1" 200 - "-" "Mozilla/4.5 [en] (Win98; I)"
81.173.157.242 - - [25/Apr/2012:13:18:40 -0600] "HEAD / HTTP/1.0" 200 - "-" "-"
218.26.48.10 - - [25/Apr/2012:06:18:57 -0600] "HEAD / HTTP/1.0" 200 - "-" "-"


How do I block these? They come from a variety of IPs and the "-" stuff isn't even consistent. Is there a way to block such accesses, short of blocking each IP as I notice it?
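Before blocking anything, it may help to see which requests actually arrive with a blank UA. The following is a minimal sketch, assuming the log is in Apache's combined format as in the sample lines above. The names `LOG_PATTERN`, `has_blank_agent`, and `blank_agent_ips` are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def has_blank_agent(line):
    """Return True if the line parses and its User-Agent is empty or "-"."""
    match = LOG_PATTERN.match(line.strip())
    return bool(match) and match.group("agent").strip() in ("", "-")


def blank_agent_ips(lines):
    """Count blank-UA requests per client IP; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        if has_blank_agent(line):
            counts[LOG_PATTERN.match(line.strip()).group("ip")] += 1
    return counts


# The four sample lines from the post above.
SAMPLE = [
    '91.121.132.213 - - [26/Apr/2012:00:07:03 -0600] "GET / HTTP/1.0" 200 6727 "-" "-"',
    '24.178.44.245 - - [26/Apr/2012:00:49:09 -0600] "HEAD /reviews/ HTTP/1.1" 200 - "-" "Mozilla/4.5 [en] (Win98; I)"',
    '81.173.157.242 - - [25/Apr/2012:13:18:40 -0600] "HEAD / HTTP/1.0" 200 - "-" "-"',
    '218.26.48.10 - - [25/Apr/2012:06:18:57 -0600] "HEAD / HTTP/1.0" 200 - "-" "-"',
]

if __name__ == "__main__":
    for ip, hits in blank_agent_ips(SAMPLE).items():
        print(ip, hits)
```

On the sample, three of the four lines are flagged; the `Mozilla/4.5 [en] (Win98; I)` request is not, because it does send a UA string. The sketch only reports on the log and blocks nothing by itself.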

 

wilderness

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4445845 posted 3:27 pm on Apr 26, 2012 (gmt 0)

"blank user-agent" [google.com]

motorhaven

10+ Year Member



 
Msg#: 4445845 posted 4:49 pm on Apr 26, 2012 (gmt 0)

I apply an additional header test to these types of fetches. I've seen the satellite providers occasionally using blank UA and referrer when fetching/caching images. The other header test catches things like you posted 99.99% of the time and lets the legit cache fetches through.
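The post does not say which header is tested, so the following is only a sketch of the general idea: refuse a request only when the UA is blank and a second header is also missing. The choice of `Accept` as that second header, and the names `is_blank` and `should_block`, are assumptions for illustration, not motorhaven's actual rule.

```python
def is_blank(value):
    """Treat a missing, empty, or "-" header value as blank."""
    return value is None or value.strip() in ("", "-")


def should_block(headers, extra_header="Accept"):
    """Block only if User-Agent AND the extra header are both blank.

    `extra_header` defaults to "Accept" purely as a placeholder; the
    original post does not name the header it checks.
    """
    # HTTP header names are case-insensitive, so normalise the keys.
    lowered = {name.lower(): value for name, value in headers.items()}
    blank_agent = is_blank(lowered.get("user-agent"))
    blank_extra = is_blank(lowered.get(extra_header.lower()))
    return blank_agent and blank_extra


if __name__ == "__main__":
    # Blank UA and no Accept header: refused.
    print(should_block({"Host": "example.com"}))
    # Blank UA but an Accept header is present: allowed through.
    print(should_block({"Host": "example.com", "Accept": "image/png"}))
```

Requiring both conditions is what lets a blank-UA fetch through when it otherwise looks legitimate, which matches the goal described in the post.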

Kendo

5+ Year Member



 
Msg#: 4445845 posted 5:24 pm on Apr 26, 2012 (gmt 0)

Do you want to block search spiders, or just the ones that don't provide a moniker?

A cookie check will trap most spiders... write a cookie onLoad and then try to read it... if no cookie then redirect. Lack of JavaScript can have the same effect... using a NoScript statement that includes a redirect.
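As a rough illustration, here is a server-side variant of the cookie test. It is not the onLoad JavaScript version described above: the server sets a cookie, redirects once with a marker in the query string, and treats a client that comes back without the cookie as one that does not keep cookies. The function name `cookie_check`, the cookie name `seen`, and the query parameter `ck` are all hypothetical.

```python
def cookie_check(cookies, query, cookie_name="seen", probe_param="ck"):
    """Decide how to handle a request based on a one-time cookie probe.

    cookies: dict of cookie names to values sent by the client
    query:   dict of query-string parameters on the request
    Returns (action, details), where action is one of
    "serve", "redirect", or "no_cookies".
    """
    if cookie_name in cookies:
        # The client returned our cookie, so it passed the probe.
        return ("serve", None)
    if query.get(probe_param) == "1":
        # Already probed once and the cookie did not come back.
        return ("no_cookies", None)
    # First visit: set the cookie and redirect with the probe marker.
    return (
        "redirect",
        {"set_cookie": f"{cookie_name}=1", "add_query": {probe_param: "1"}},
    )


if __name__ == "__main__":
    print(cookie_check({}, {}))              # first visit
    print(cookie_check({"seen": "1"}, {}))   # cookie came back
    print(cookie_check({}, {"ck": "1"}))     # cookie was refused
```

What to do with the `"no_cookies"` case is a policy decision left to the caller; the sketch only detects it.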

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4445845 posted 10:38 pm on Apr 26, 2012 (gmt 0)

Lotsa humans around here block cookies and/or javascript by default. Higher level of knowledge and/or higher level of paranoia than the average user in the street.

Kendo

5+ Year Member



 
Msg#: 4445845 posted 4:40 pm on May 9, 2012 (gmt 0)

Must be paranoia and ignorance. Flash and video usually use JavaScript to detect the browser. Without browser detection, you don't see the show. Nor do you see a page layout optimised for screen or window size.

motorhaven

10+ Year Member



 
Msg#: 4445845 posted 5:49 pm on May 9, 2012 (gmt 0)

It doesn't matter what reason the user has; if you block users because they do not enable cookies, you're doing yourself a disservice. About 10% of users do not enable cookies.

dstiles

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4445845 posted 8:01 pm on May 9, 2012 (gmt 0)

Kendo - cookies are no guarantee. The technique may work with "real" bots but a LOT of scrapers maintain cookies - which is especially useful for dealing with IP-hoppers. :)

In addition I (for example) refuse cookies on most sites I visit. Ditto JS. As for Flash - I almost never view that either.

There is also a UK law now that says that web sites must offer an option to not accept cookies. This is very badly thought out (eg no mention of how US etc sites can be trapped) but it's bureaucrats so you can't really expect much logical thought on the matter. Nevertheless, a few web sites in the UK will implement either an option or dump cookies completely where they are of little use.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved