Apache Web Server Forum

    
Bot hits RSS every 15 minutes
How do I block a "Windows 98" bot?
SaleF




msg:4575564
 12:06 am on May 20, 2013 (gmt 0)

A bot has been visiting my site for months. It comes through the Opera Mini proxy from multiple IPs (hostname: z07-13.opera-mini.net), so I can't just block its IP.
I had already blocked it using this code in .htaccess:
SetEnvIf user-agent "^Windows 98" ban #Bad bot
Order allow,deny
allow from all
deny from env=ban

But after I changed hosts, the bot came back.
I contacted the new host's support, but they gave me the same kind of code I already had:
SetEnvIf User-Agent Mozilla/4.0 (Windows 98; US) Opera 10.00 [en] GoAway=1
Order allow,deny
Allow from all
deny from env=GoAway

This has no effect on this annoying bot.
I have also tested this code:
RewriteCond %{HTTP_USER_AGENT} "^windows 98"
RewriteRule ^.* - [F,L]

This works on me when I change my browser's UA with the Firefox User-Agent add-on, but it doesn't work on the bot, and I don't know why.

The bot has gotten worse and is now grabbing content from my site.

 

Key_Master




msg:4575569
 12:41 am on May 20, 2013 (gmt 0)

Here is the correct code for all three options. The second is the most precise. The first and last will ban any user-agent containing "Windows 98".

SetEnvIf User-Agent Windows\s98 ban
Order allow,deny
allow from all
deny from env=ban


SetEnvIf User-Agent "^Mozilla/4\.0 \(Windows 98; US\) Opera 10\.00 \[en\]$" GoAway
Order allow,deny
allow from all
deny from env=GoAway


RewriteCond %{HTTP_USER_AGENT} Windows\s98
RewriteRule .* - [F]

lucy24




msg:4575598
 2:25 am on May 20, 2013 (gmt 0)

SetEnvIf User-Agent Mozilla/4.0 (Windows 98; US) Opera 10.00 [en] GoAway=1

Ouch. I hope that wasn't a literal quote from htaccess. What mod_setenvif thinks this means is:

If the user-agent is "Mozilla/4.0"
then set this list of environment variables:
(Windows
98;
US)
Opera
10.00
[en]
and finally
GoAway, which gets the specific value of 1. Frankly you're lucky this didn't result in locking out all UAs containing the string "Mozilla/4.0" (robots plus all but the latest versions of MSIE). For that matter, maybe you did lock them out, you just didn't notice :)

Remember that in Apache, a literal space very often has semantic meaning. So it needs to be either escaped or hidden inside quotation marks. Some mods let you go either way; some are more particular. In mod_setenvif, quotation marks are enough.

Incidentally, there is a useful shorthand in mod_setenvif. When matching against the user-agent, you can say
BrowserMatch
or
BrowserMatchNoCase
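
As a rough sketch of that shorthand, the first option in this thread could be written as below. It assumes the same variable name (ban) and the same Order/Allow/Deny access directives already used above; nothing here has been tested against the OP's server.

```apache
# BrowserMatchNoCase is shorthand for: SetEnvIfNoCase User-Agent ...
# Sets "ban" for any user-agent containing "windows 98", in any letter case.
# The quotation marks keep the space inside the pattern.
BrowserMatchNoCase "Windows 98" ban

Order allow,deny
Allow from all
Deny from env=ban
```

Because the pattern is unanchored, this also catches any real visitor whose browser reports Windows 98, so it is the broad option rather than the precise one.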

Edit:
^windows 98

With the opening anchor, this rule-- in any module-- will only work on user-agents that begin "windows 98". With lower-case w, because Regular Expressions are case sensitive unless you've particularly told them not to be.
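
A minimal sketch of the mod_rewrite version with both of those points addressed, assuming the goal is to refuse any user-agent that contains "Windows 98" anywhere in the string:

```apache
RewriteEngine On
# No leading ^, so the pattern can match in the middle of the user-agent
# (the bot's string starts with "Mozilla/4.0", not with "Windows").
# [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_USER_AGENT} "windows 98" [NC]
RewriteRule ^ - [F]
```

That would also explain the test result in the first post: a browser spoofed to send a user-agent that literally begins with lower-case "windows 98" matches the anchored pattern, while the bot's real string does not.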

epricity




msg:4576204
 2:12 pm on May 21, 2013 (gmt 0)

I know this is a dumb question, considering you've already decided on a solution. But why block the bot? The bot is reading your rss feed, and aggregating your content for you, and puts links to your content in more places than it would be normally. This helps you. Why would you want to stop it?

lucy24




msg:4576287
 8:02 pm on May 21, 2013 (gmt 0)

Best guess: because it isn't putting links anywhere, it's just scraping the content for other sites' benefit.
