Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
SEOPTI

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3448831 posted 7:11 pm on Sep 12, 2007 (gmt 0)

I have banned this one; it's some sort of downloader. I can't think of any legitimate visitors still using Windows 98.

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible\;\ MSIE\ 6\.0\;\ Windows\ 98\)$
RewriteRule .* - [F]

 

volatilegx

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3448831 posted 1:43 pm on Sep 13, 2007 (gmt 0)

I was over at some elderly folks' home just the other day installing their new printer, and their desktop still ran Win 98 ;)

wilderness

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3448831 posted 2:43 pm on Sep 13, 2007 (gmt 0)

Dan,
I still get many visitors using 98. Of course, my widgets are of interest to an older generation and/or market.

I believe he's focusing on "ends with 98", which would imply that no updates or other software have been added (at least none known to the UA).
There is also the possibility that somebody is using a "barebones" Win 98 UA to harvest.
Just a thought.

Seopti,
If your intent was to stop the "ends with" case, your rewrite (although functional) is overkill.

RewriteCond %{HTTP_USER_AGENT} 98\)$
RewriteRule .* - [F]

Please note: the semicolon does not require escaping, but a literal closing parenthesis should be escaped, since an unmatched ")" does not compile under the PCRE engine used by Apache 2.x.

In addition, you could make your rewrite more selective by adding an IP range:

RewriteCond %{HTTP_USER_AGENT} 98\)$
RewriteCond %{REMOTE_ADDR} ^123\.456\.789\.123
RewriteRule .* - [F]
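
To see what the "ends with" test actually catches, here is a minimal Python sketch. It is not Apache: Python's `re` module only stands in for the regex engine behind mod_rewrite, and the `is_blocked` helper, the sample user-agent strings, and the 192.0.2.x prefix are all made up for illustration. Two stacked RewriteCond lines are ANDed by default, which the helper mimics.

```python
import re

# "Ends with 98)" test. The closing parenthesis is escaped here because
# Python's re module rejects an unmatched ")".
ENDS_WITH_98 = re.compile(r"98\)$")

# Hypothetical address prefix standing in for the placeholder range in the post.
IP_PREFIX = re.compile(r"^192\.0\.2\.")


def is_blocked(user_agent: str, remote_addr: str) -> bool:
    """Mimic two stacked RewriteCond lines: both must match (implicit AND)."""
    ua_hit = ENDS_WITH_98.search(user_agent) is not None
    ip_hit = IP_PREFIX.search(remote_addr) is not None
    return ua_hit and ip_hit


# Illustrative user agents: the bare one from the thread title, and one
# where the browser reports extra tokens after "Windows 98".
bare_ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
extended_ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; .NET CLR 1.1.4322)"

print(is_blocked(bare_ua, "192.0.2.15"))      # True: bare UA, listed range
print(is_blocked(extended_ua, "192.0.2.15"))  # False: UA does not end in "98)"
print(is_blocked(bare_ua, "198.51.100.7"))    # False: bare UA, other range
```

A Win 98 visitor whose browser appends anything after "Windows 98" slips past the pattern, while any visitor sending the bare string is refused unless the IP condition narrows it further.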

Don

SEOPTI

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3448831 posted 1:39 pm on Sep 19, 2007 (gmt 0)

Thanks, wilderness. I need to look into the bot trap; this approach blocked this scam but also blocked real visitors.

Eric

5+ Year Member



 
Msg#: 3448831 posted 2:41 pm on Sep 19, 2007 (gmt 0)

:( There are definitely still people using Win98.

My method is to set up a spider trap. I can't give a sample rule here, but basically it works like this. I've used it for a long time now and it works well.

1. Put a hidden link on every page, for example:

<a href="/-domain.com-123456.html">xx</a>

2. Catch all 404 errors, and on this 123456 page display more links, like:

<a href="/-domain.com-/1.html">xx</a>
<a href="/-domain.com-/3.html">xx</a>

Also put a noindex, nofollow meta tag on the page; a decent robot will stop here.
(I also put a Disallow in robots.txt.)

3. Also in the 404 handler, if anyone crawls the pages 1.html, 3.html, etc.,
add the IP to a block list and send a 403 "Too many users" error.

4. Every page must check the visitor's IP, and if the IP is in the block list, return the 403 "Too many users" error.
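
The four steps above can be sketched roughly as follows. This is a Python stand-in, not the IIS/ASP code: `handle_request`, `blocked_ips`, and the two trap patterns are hypothetical names, the in-memory set stands in for whatever storage the real block list uses, and returning status 200 for the trap entry page is an assumption, since the post does not say which status that page sends.

```python
import re

# All names and paths here are hypothetical stand-ins for the scheme above.
TRAP_ENTRY = "/-domain.com-123456.html"                 # hidden link on every page
TRAP_DEEP = re.compile(r"^/-domain\.com-/\d+\.html$")   # links shown only on the trap page

blocked_ips = set()  # in-memory block list; a real site would persist this


def handle_request(ip, path):
    """Return (status, body) for one request, following steps 1-4."""
    # Step 4: every page checks the block list first.
    if ip in blocked_ips:
        return 403, "Too many users"
    # Step 3: anything fetching a deep trap link is listed and refused.
    if TRAP_DEEP.match(path):
        blocked_ips.add(ip)
        return 403, "Too many users"
    # Step 2: the trap entry page shows more bait links plus a robots meta tag.
    # Status 200 is an assumption; the post does not specify it.
    if path == TRAP_ENTRY:
        body = (
            '<meta name="robots" content="noindex,nofollow">'
            '<a href="/-domain.com-/1.html">xx</a>'
            '<a href="/-domain.com-/3.html">xx</a>'
        )
        return 200, body
    # Step 1: any other path stands in for a normal page carrying the hidden link.
    return 200, '<a href="%s">xx</a>' % TRAP_ENTRY


# A crawler that follows the bait two levels deep ends up locked out everywhere.
print(handle_request("203.0.113.9", "/index.html")[0])           # 200
print(handle_request("203.0.113.9", TRAP_ENTRY)[0])              # 200
print(handle_request("203.0.113.9", "/-domain.com-/1.html")[0])  # 403
print(handle_request("203.0.113.9", "/index.html")[0])           # 403
```

A robot that honors the meta tag or robots.txt never requests the second-level links, so only clients that ignore both get listed.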

----------

I implemented this with IIS and ASP. It's very easy and effective, and some days it catches lots of spiders. For example, my block list currently contains:

65.210.123.nnn
211.189.26.nnn
72.53.194.nnn
206.71.70.nnn
122.203.84.nnn
192.146.7.nnn
200.219.150.nnn
218.48.23.nnn
66.93.229.nnn
204.246.129.nnn
200.245.191.nnn
202.95.142.nnn
196.35.158.nnn
196.36.198.nnn
216.132.90.nnn
87.244.211.nnn
83.138.172.nnn
213.133.167.nnn

I clean the list every day.
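
One way the daily cleanup could be automated, as a rough Python sketch: it assumes each block-list entry stores the time it was added, and drops anything older than a day. The `purge_old_entries` helper, the one-day `MAX_AGE`, and the sample addresses are all assumptions; the post only says the list is cleaned every day, not how.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=1)  # assumed retention; the post only says "every day"


def purge_old_entries(block_list, now):
    """Return a new block list without entries added more than MAX_AGE ago."""
    return {ip: added for ip, added in block_list.items() if now - added <= MAX_AGE}


# Illustrative data: one fresh entry and one that is two days old.
now = datetime(2007, 9, 19, 14, 41)
block_list = {
    "192.0.2.10": now - timedelta(hours=3),
    "192.0.2.11": now - timedelta(days=2),
}

print(sorted(purge_old_entries(block_list, now)))  # ['192.0.2.10']
```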

-----------

Why send a 403 "Too many users" error? Because if you send a 404 or 500 error, the spider will know something is wrong. Also, if you mistakenly catch a good robot, it still has a chance to come back.

[edited by: volatilegx at 6:32 pm (utc) on Sep. 19, 2007]
[edit reason] obfuscated ip addresses [/edit]


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved