


Bot hits RSS every 15 minutes

How do I block the Windows 98 bot?

   
12:06 am on May 20, 2013 (gmt 0)



This bot has been visiting my site for months. It uses the Opera proxy with multiple IPs (hostname: z07-13.opera-mini.net), so I can't simply block it by IP.
I had already blocked it using this code in .htaccess:
SetEnvIf user-agent "^Windows 98" ban #Bad bot
Order allow,deny
allow from all
deny from env=ban

But after I changed hosts, the bot came back.
I contacted the new host's support, but they gave me the same code I already had:
SetEnvIf User-Agent Mozilla/4.0 (Windows 98; US) Opera 10.00 [en] GoAway=1
Order allow,deny
Allow from all
deny from env=GoAway

This has no effect on this annoying bot.
I have also tested this code:
RewriteCond %{HTTP_USER_AGENT} "^windows 98"
RewriteRule ^.* - [F,L]

This works on me when I change my browser's UA using a Firefox User-Agent add-on, but it doesn't work on the bot, and I don't know why.

The bot has gotten worse and is now grabbing content from my site.
12:41 am on May 20, 2013 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here is the correct code for all three options. The second is the most precise. The first and last will ban any user-agent with Windows 98.

SetEnvIf User-Agent Windows\s98 ban
Order allow,deny
allow from all
deny from env=ban


SetEnvIf User-Agent "^Mozilla/4\.0 \(Windows 98; US\) Opera 10\.00 \[en\]$" GoAway
Order allow,deny
allow from all
deny from env=GoAway


RewriteCond %{HTTP_USER_AGENT} Windows\s98
RewriteRule .* - [F]
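
On servers running Apache 2.4 or later, Order/Allow/Deny are provided by the compatibility module mod_access_compat, and the native form uses Require. The thread does not say which version the new host runs, so the following is only a sketch of the first option in that style, assuming Apache 2.4 with mod_setenvif and mod_authz_core available:

```apache
# Assumption: Apache 2.4+; the server version is not stated in the thread.
# Flag any user-agent containing "Windows 98" (unanchored, case-sensitive).
SetEnvIf User-Agent Windows\s98 ban

# Allow everyone except requests where the "ban" variable is set.
<RequireAll>
    Require all granted
    Require not env ban
</RequireAll>
```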
2:25 am on May 20, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24



SetEnvIf User-Agent Mozilla/4.0 (Windows 98; US) Opera 10.00 [en] GoAway=1

Ouch. I hope that wasn't a literal quote from htaccess. What mod_setenvif thinks this means is:

If the user-agent is "Mozilla/4.0"
then set this list of environment variables:
(Windows
98;
US)
Opera
10.00
[en]
and finally
GoAway, which gets the specific value of 1. Frankly you're lucky this didn't result in locking out all UAs containing the string "Mozilla/4.0" (robots plus all but the latest versions of MSIE). For that matter, maybe you did lock them out, you just didn't notice :)

Remember that in Apache, a literal space very often has semantic meaning. So it needs to be either escaped or hidden inside quotation marks. Some mods let you go either way; some are more particular. In mod_setenvif, quotation marks are enough.
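
To illustrate the point about spaces, here is a sketch of two ways to write the same mod_setenvif test so that the pattern stays a single argument. The variable name ban is reused from the earlier posts; use one line or the other, not both.

```apache
# Quoted: the space is hidden inside quotation marks.
SetEnvIf User-Agent "Windows 98" ban

# Unquoted: \s matches a whitespace character, so the pattern
# itself contains no literal space.
SetEnvIf User-Agent Windows\s98 ban
```

Comments go on their own lines here because Apache does not support a trailing # comment on the same line as a directive.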

Incidentally, there is a useful shorthand in mod_setenvif. When matching against the user-agent, you can say
BrowserMatch
or
BrowserMatchNoCase
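
As a sketch, the "ban anything with Windows 98" option could be written with this shorthand as follows. BrowserMatchNoCase tests the User-Agent header case-insensitively, so it would also catch "windows 98" in lower case. The variable name ban is carried over from the earlier posts.

```apache
# Equivalent to: SetEnvIfNoCase User-Agent "Windows 98" ban
BrowserMatchNoCase "Windows 98" ban
Order allow,deny
Allow from all
Deny from env=ban
```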

Edit:
^windows 98

With the opening anchor, this rule (in any module) will only match user-agents that begin with "windows 98", and with a lower-case w at that, because Regular Expressions are case-sensitive unless you've specifically told them not to be.
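
A sketch of the same mod_rewrite rule without the anchor and with case-insensitive matching, assuming mod_rewrite is enabled and that the bot sends the Opera user-agent string quoted earlier in the thread:

```apache
RewriteEngine On
# No leading ^, so "windows 98" may appear anywhere in the user-agent.
# [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_USER_AGENT} "windows 98" [NC]
RewriteRule .* - [F]
```

Written this way, a user-agent such as Mozilla/4.0 (Windows 98; US) Opera 10.00 [en] is matched, whereas ^windows 98 does not match it, since that string begins with Mozilla.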
2:12 pm on May 21, 2013 (gmt 0)



I know this is a dumb question, considering you've already decided on a solution. But why block the bot? The bot is reading your RSS feed, aggregating your content for you, and putting links to your content in more places than they would normally be. This helps you. Why would you want to stop it?
8:02 pm on May 21, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24



Best guess: because it isn't putting links anywhere, it's just scraping the content for other sites' benefit.
 
