
Ecommerce Forum

Malicious Bots Crawling Site?

5+ Year Member

Msg#: 4360510 posted 2:02 pm on Sep 9, 2011 (gmt 0)


Usually when I'm sleeping and can't monitor the server, my RAM seems to get exhausted and the website stops responding. I think this could be caused by malicious bots crawling my site, which push RAM usage to 99% and crash my server.

Has anyone ever experienced an issue like this with malicious bots crawling their site? Are there any log files I can look at to see which bots/IPs are actually causing this issue?

Thank you,




WebmasterWorld Senior Member 10+ Year Member

Msg#: 4360510 posted 5:47 pm on Sep 9, 2011 (gmt 0)

Does your server not produce what are called "raw log files", which record every file taken from your site, at what time, by which IP and with what user agent?

Those would be the first to look at.
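
As a rough illustration of reading such a log, the sketch below counts requests per IP and user agent. It assumes the server writes the Apache "combined" log format, which the thread does not confirm; the pattern and the function name `top_clients` are illustrative only.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format. Adjust the pattern if your
# server logs different fields or a different order.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def top_clients(lines, n=10):
    """Return the n busiest (ip, user agent) pairs with their request counts."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            counts[(match["ip"], match["agent"])] += 1
    return counts.most_common(n)
```

To try it, pass an open file handle for your access log to `top_clients`; a single IP or user agent with a far higher count than the rest is the first thing to investigate.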


WebmasterWorld Senior Member 10+ Year Member

Msg#: 4360510 posted 10:45 pm on Sep 9, 2011 (gmt 0)

It might be malicious bots, which you could check, as Staffa says, in your raw logs or even in the AWStats Access Details Report on your stats page, if you're using AWStats. I see all kinds of bots in mine.

But if your RAM is being used up, then I would say you do not have enough RAM for your site and need to either upgrade or make sure that your server software is not running amok and using up a lot of it. That happened to me recently. The other thing is if you are on a shared server, it could be that one of the other sites you're sharing with is doing stupid stuff at night and using up all the RAM.


5+ Year Member

Msg#: 4360510 posted 11:01 pm on Sep 9, 2011 (gmt 0)

I noticed a large amount of crawling coming from FatBot (TheFind); I ended up blocking their IP range, so hopefully this keeps the RAM from being exhausted.

I'm on a dedicated server with 12GB RAM, so I think I have enough RAM for this server. The site has been running fine for a while now; so hopefully it was the FatBot and the site should be functioning better now.

I'll also go through the "raw access logs" to see if anything else jumps out at me as strange.


WebmasterWorld Senior Member lucy24, WebmasterWorld Top Contributor of All Time, Top Contributors Of The Month

Msg#: 4360510 posted 1:14 am on Sep 10, 2011 (gmt 0)

If malicious bots are not crawling your site, I strongly recommend that you drop whatever business you are in and go into consulting or lecturing, because I can think of a few thousand people who would pay good money to learn your secret.

But, uhm, robots don't exactly keep to schedules or time zones. They're machines. They don't know when you're asleep. Cue Twilight Zone theme...

Now, if most of your legitimate traffic comes in during a particular time of day, and this happens to coincide with the time you're asleep... Well, for that you don't even need raw logs. Your basic built-in analog stats will tell you that much.


5+ Year Member

Msg#: 4360510 posted 7:08 pm on Sep 10, 2011 (gmt 0)

If your server software is Apache, see the announcement here (and follow the link to the CVE) at [httpd.apache.org...] about the August 2011 release of Apache 2.2.20, which fixes a bug in prior Apache versions. A maliciously crafted request using the "byte range" feature can consume all memory and crash a server.

A web search on
apache killer byte range
will turn up numerous articles about it.
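
For anyone who wants to see what such a request looks like in their own traffic, here is a minimal sketch that counts the ranges in a `Range` header value. The function names and the `max_ranges` threshold are made up for illustration and do not come from the Apache advisory; note also that the default access log does not record the `Range` header, so you would have to log it yourself first. Upgrading Apache remains the actual fix.

```python
def count_byte_ranges(range_header):
    """Count the range specs in a Range header value such as 'bytes=0-99,200-299'."""
    unit, sep, specs = range_header.partition("=")
    if not sep or unit.strip().lower() != "bytes":
        return 0  # not a byte-range request
    return sum(1 for spec in specs.split(",") if spec.strip())

def looks_suspicious(range_header, max_ranges=5):
    # max_ranges is an arbitrary illustrative threshold, not an official value.
    return count_byte_ranges(range_header) > max_ranges
```

A normal browser or download manager asks for one or a handful of ranges; a header listing dozens or hundreds of them is the pattern described in the articles.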


WebmasterWorld Senior Member 10+ Year Member

Msg#: 4360510 posted 7:15 am on Sep 12, 2011 (gmt 0)

If you are seeing nightly spikes in your stats, this is not necessarily due to bots; it could also be cron jobs running on your server at night. So check the raw log files to see whether it really is traffic. It could also be the script evaluating your log files and creating your statistics.
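
One way to tell traffic from cron jobs is to count logged requests per hour: if the server runs out of RAM at night while the request count stays flat, the cause is probably not visitors or bots. The sketch below assumes the usual Apache timestamp layout, e.g. `[09/Sep/2011:02:15:07 +0000]`; the function name is hypothetical.

```python
import re
from collections import Counter

# Assumed timestamp layout of Apache common/combined logs.
TIMESTAMP = re.compile(r'\[\d{2}/\w{3}/\d{4}:(?P<hour>\d{2}):\d{2}:\d{2} [^\]]*\]')

def requests_per_hour(lines):
    """Return a 24-element list: number of requests logged in each hour of the day."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            counts[int(match["hour"])] += 1
    return [counts[hour] for hour in range(24)]
```

The hours are in whatever time zone the server writes to its log, so compare them against the time the crashes actually happen.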


Msg#: 4360510 posted 6:37 am on Oct 16, 2011 (gmt 0)

Yes, we did experience this issue when we were using Magento.
We have now switched to a very sophisticated cart that includes a bot detector that not only detects them (using a complex heuristic) but also takes them out of the site statistics and tells you exactly when and how many pages were visited by that bot!
Then, using the user agent information detected, you can insert a simple rule in your .htaccess file to reroute that bot to a static page.



5+ Year Member

Msg#: 4360510 posted 11:49 am on Nov 7, 2011 (gmt 0)

No, never had it happen with Magento but if you are not running enough RAM it can be an issue. 128GB is not enough for a web server, Apache or IIS.

TheFind is not a malicious bot but rather a shopping comparison site that creates free links to your site. Block it and you lose those free inbound links. It is cheaper to add RAM, using the money you save on building inbound links by not blocking the bots.


WebmasterWorld Senior Member topr8, WebmasterWorld Top Contributor of All Time, 10+ Year Member

Msg#: 4360510 posted 12:02 pm on Nov 7, 2011 (gmt 0)

>>>128GB is not enough for a web server, Apache or IIS.

i disagree, that would be an astonishingly large amount of RAM, however given that you made a typo and meant 12GB, that is also a large amount of RAM for a dedicated server.

THE FIND is an affiliate and coupon site in disguise, i don't think i've ever had a referral from it.


WebmasterWorld Administrator incredibill, WebmasterWorld Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month

Msg#: 4360510 posted 12:24 pm on Nov 7, 2011 (gmt 0)

Heck, I run on 2GB with a huge database and you can't bring my server down without trying real hard, of course it's Linux/Apache, none of that bloated Windows stuff ;)

BTW, depending on the ecommerce you're running, I've seen misconfigured catalogs cause the software to go looping into cyberspace and bring the server down. Usually stupid stuff like a parent/child category chain that ends up in a loop somewhere so it runs forever. You might want to scan your logs for "500" errors, which will probably happen to all the scripts when you kill them when it's overloaded like that.

FYI, how I figure out who/what overloaded the server is by simply killing all running web tasks, not Apache itself, and then checking the log files to see what gave server 500 response codes when I killed them. You'll see the user agent hammering the site if that's the problem, or you'll see the script that's going off the deep end if that happens to be the issue.
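
The log check described above could be sketched roughly as follows: pull out the requests that ended in a 500 response and tally them by URL and by user agent. This again assumes the Apache "combined" log format, and `server_errors` is an illustrative name, not part of any tool mentioned in the thread.

```python
import re
from collections import Counter

# Assumed layout: "METHOD /path PROTOCOL" status size "referer" "user agent"
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def server_errors(lines):
    """Tally requests that returned HTTP 500, by requested path and by user agent."""
    by_path, by_agent = Counter(), Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match["status"] == "500":
            by_path[match["path"]] += 1
            by_agent[match["agent"]] += 1
    return by_path, by_agent
```

If one user agent dominates `by_agent`, a bot is the likely culprit; if one path dominates `by_path` across many different agents, look at the script behind that URL instead.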


5+ Year Member

Msg#: 4360510 posted 12:33 pm on Nov 7, 2011 (gmt 0)

@topr8 Strange I get 10-12 visitors a day from The Find. Not breaking any records but 20 inbound links. They build their index from many sites and as pointed out are an affiliate and coupon site, so what? Give me a hundred more sites like that.

Blocking traffic is a funny way of dealing with it. Just my opinion. Block away :)

*edit cause I mis-read the memory size.


WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month

Msg#: 4360510 posted 12:08 am on Nov 20, 2011 (gmt 0)

I have also blocked thefind and visitors coming from thefind. I do not normally block visitors from any source, but FatBot/thefind is greedy and malicious. They have been sending over 100 visits (not hits) a day from which I gain zero. They click out on my links but I am never credited. Within hours of blocking them I got my first commission from the main directory where they have been wasting my resources. These are not plain inbound links, they are links from content scraped from my site. They are still sending visitors but visit length is 0 seconds now. It is a drastic measure for a drastic situation.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved