
Home / Forums Index / WebmasterWorld / Webmaster General
Forum Library, Charter, Moderators: phranque

Webmaster General Forum

Block amazonaws.com with .htaccess?
Can't block 'em

 7:23 pm on Feb 13, 2009 (gmt 0)

I've tried the following but I can't seem to block amazonaws.com visitors to my sites. This is in the .htaccess file in public_html, the root of my site on a hosted server:

# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} amazonaws\.com [NC]
RewriteRule .* - [F]

order allow, deny
deny from .amazonaws.com

Any other ideas? It seems that scammers and such are using the Amazon cloud for cheap power to steal sites, etc.

- jim


The Contractor

 7:40 pm on Feb 13, 2009 (gmt 0)

Looking at my blocked log they are coming in using the following UA:


I have all of them blocked via .htaccess anyway.


 9:34 pm on Feb 13, 2009 (gmt 0)

Hmmm, don't know what UA is or what to do with those :-)

The Contractor

 1:57 am on Feb 14, 2009 (gmt 0)

RewriteEngine On

RewriteCond %{HTTP_REFERER} amazonaws\.com [OR]
RewriteCond %{HTTP_USER_AGENT} "AISearchBot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "woriobot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "heritrix" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "NetSeer" [NC,OR]
# No [OR] on the last condition: a trailing [OR] would make the rule match every request
RewriteCond %{HTTP_USER_AGENT} "Nutch" [NC]
RewriteRule ^.*$ - [F]


 3:26 am on Feb 14, 2009 (gmt 0)

Ahh, User Agent = UA. I'm studying this stuff but it is far from coming together for me. Still I understand what you've given me and I pasted it into my .htaccess. THANKS! I'll get back to you how it worked in a couple of days.

- jim


 6:32 pm on Feb 17, 2009 (gmt 0)

I put the .htaccess file in the main root of the hosted service, in /www, and in /www/mysite and no results. I'm working with a programmer but he seems a bit light on the subject.

The Contractor

 5:11 pm on Feb 18, 2009 (gmt 0)

Make sure your hosting company allows/supports .htaccess and allows overrides in the httpd.conf file.


 5:21 pm on Feb 18, 2009 (gmt 0)

I started wondering that this morning because we are getting nowhere. Thanks for the comment because that helps me explain it to them.

The Contractor

 5:31 pm on Feb 18, 2009 (gmt 0)

An easy way to test whether your blocking by User Agent is working: use Firefox, install the "User Agent Switcher" plugin, add one of the UA strings you are trying to block, and visit your site.
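The same check can be scripted instead of done in a browser. What follows is only a minimal sketch in Python using the standard library: the helper names `build_request` and `status_for_user_agent` are made up for illustration, and the URL in the usage comment is a placeholder for your own site. It sends one request with a chosen User-Agent header and reports the status code, so a blocked UA should come back as 403 if the rules are active.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent):
    """Build a GET request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for_user_agent(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends back for this User-Agent."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx responses as HTTPError; the code is what we want.
        return err.code


# Example usage (placeholder URL, replace with your own site):
#   for ua in ("Nutch", "heritrix", "Mozilla/5.0"):
#       print(ua, status_for_user_agent("http://www.example.com/", ua))
```

If the blocked strings return 403 and an ordinary browser string returns 200, the User Agent conditions are being applied. If everything returns 200, the .htaccess file is probably not being read at all.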


 6:35 pm on Feb 19, 2009 (gmt 0)

My programmer tried your suggestion. It seems that the amazonaws apps are faking browsers. I'm really clueless but we can't seem to stop them. He found amazonaws IP addresses so we are trying that now.

Maybe 30% of my page views are now from hosts like ec2-174-129-115-45.compute-1.amazonaws.com and there are dozens of these addresses in the log.
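For anyone who wants to turn hostnames like that into IP addresses, here is a rough Python sketch. It assumes the log hostnames follow the pattern ec2-A-B-C-D.compute-1.amazonaws.com, where A-B-C-D are the four parts of the address; the function names are made up for illustration, and the input is assumed to be a plain list of hostnames already pulled out of the log.

```python
import re
from collections import Counter

# Assumed hostname pattern: ec2-A-B-C-D.compute-1.amazonaws.com
EC2_HOST = re.compile(
    r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.compute-1\.amazonaws\.com$",
    re.IGNORECASE,
)


def ec2_host_to_ip(host):
    """Return the dotted IP embedded in an EC2 hostname, or None if no match."""
    match = EC2_HOST.match(host.strip())
    if match is None:
        return None
    return ".".join(match.groups())


def count_ec2_ips(hosts):
    """Count how often each EC2 address appears in a list of hostnames."""
    counts = Counter()
    for host in hosts:
        ip = ec2_host_to_ip(host)
        if ip is not None:
            counts[ip] += 1
    return counts
```

Feeding it the hostname column from the access log gives a tally of the busiest addresses, which is a reasonable starting list for blocking by IP.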

- jim


 7:19 pm on Feb 21, 2009 (gmt 0)

This code finally stopped the amazonaws.com accesses to our site:

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://.*amazonaws\.com [OR]
RewriteCond %{REMOTE_HOST} ^.*\.compute-1\.amazonaws\.com$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "AISearchBot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "woriobot" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "heritrix" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "NetSeer" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Nutch" [NC]
RewriteRule ^(.*)$ - [F]

Thanks for your help The Contractor! This thread should be useful for others encountering this issue, and all sites probably will eventually. Scammers love the cheap power of cloud computing.

- jim
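To sanity-check which requests those conditions should catch before touching the live site, the patterns can be modelled in Python. This is only an approximation, not Apache itself: it assumes the conditions behave as a plain OR chain, that [NC] means case-insensitive matching, and that the server can actually resolve the visitor's hostname for REMOTE_HOST. The name `would_be_forbidden` is made up for illustration.

```python
import re

# (request field, pattern) pairs copied from the RewriteCond lines above.
# Conditions without [NC] are case-sensitive; the rest use re.IGNORECASE.
CONDITIONS = [
    ("referer", re.compile(r"^http://.*amazonaws\.com")),
    ("remote_host", re.compile(r"^.*\.compute-1\.amazonaws\.com$", re.IGNORECASE)),
    ("user_agent", re.compile(r"AISearchBot", re.IGNORECASE)),
    ("user_agent", re.compile(r"woriobot", re.IGNORECASE)),
    ("user_agent", re.compile(r"heritrix", re.IGNORECASE)),
    ("user_agent", re.compile(r"NetSeer", re.IGNORECASE)),
    ("user_agent", re.compile(r"Nutch", re.IGNORECASE)),
]


def would_be_forbidden(referer="", remote_host="", user_agent=""):
    """Return True if any condition matches, i.e. the rule would answer 403."""
    fields = {
        "referer": referer,
        "remote_host": remote_host,
        "user_agent": user_agent,
    }
    return any(pattern.search(fields[name]) for name, pattern in CONDITIONS)
```

For example, `would_be_forbidden(remote_host="ec2-174-129-115-45.compute-1.amazonaws.com")` matches the REMOTE_HOST condition even when the User Agent looks like an ordinary browser, which is the case the UA-only version could not stop.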


 10:43 pm on Feb 21, 2009 (gmt 0)


You're off to a good start with those, and you may also want to check the Search Engine Spider Identification [webmasterworld.com] forum for a lot more on AmazonAWS.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved