Odd massive simultaneous repeated GET requests

         

anotheradministrator

5:54 am on Mar 11, 2009 (gmt 0)

10+ Year Member



Looking through my Apache logs, I find that occasionally the same host requests the same file a huge number of times in very quick succession. The files are invariably small image files linked from most pages of the site. On occasion a host will request the same file tens of times a second; a few times I have seen a host request one of these several hundred times in the space of a few seconds.

It doesn't look like a DoS attempt, because the requested files are so small, and the user agent must be sending the correct conditional request headers, since Apache responds to all the requests with a 304. But what causes this? Is there a known bug in some piece of software? It doesn't seem to cause any real damage, since the requests are all so small.

The requests come from different hosts, with no referrer specified but usually in between valid-looking page requests, and from the logs the hosts look more like regular users than bots. Recent user agent strings include:

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)"

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; CNCDialer; .NET CLR 2.0.50727)"

Does anyone have any ideas?

Marino

1:14 pm on Mar 13, 2009 (gmt 0)

10+ Year Member



Hello,

I encountered the same thing on a favicon and other images. I 403-ed their IPs through .htaccess with a "deny".

It may be someone trying to show up in your stats page. If your stats page is public, you should password-protect it.

Regards,

Marino

anotheradministrator

10:30 am on Mar 14, 2009 (gmt 0)

10+ Year Member



These aren't requests for the favicon, or for images related to any stats pages, but for small icons which appear on normal HTML pages of the site. Normally they get requested at most once per page view (usually much less, since browsers can easily cache them). But occasionally they get requested over and over by a particular host, with the requests apparently coming in simultaneously.

Marino

11:36 pm on Mar 14, 2009 (gmt 0)

10+ Year Member



No, I was not talking about images related to a stats page.

Consider the following. Your stats page lists your most frequent referrers. If someone requests ANY of your images 1,000 times, the referrer they send may end up listed on your stats page as a frequent one.

If your stats page is public, their site automatically gets a backlink from it. That's how and why some adult/pharmacy sites may be listed in your stats page.

Got it? ;-)

anotheradministrator

12:04 pm on Mar 16, 2009 (gmt 0)

10+ Year Member



OK, I see what you mean now. But I don't think that's the explanation in this case, because there are no identifying features or links coming with the requests: the referrer is always blank, and the user agent is always a standard one (and doesn't contain a URL), so even if it got onto a stats page it wouldn't show a link or any identifiable information besides the requesting host's IP address.

jimr451

2:14 pm on Jul 2, 2009 (gmt 0)

10+ Year Member



Hi,

I've seen this as well, just recently. I was wondering if you ever got an inkling as to what caused it?

From my logs, it also looks like a legit browser, but it's downloading image files many times when loading a page. It's almost as if the browser uses no caching at all, or doesn't realize it already has the file. Most of the images are small spacers, used many times throughout the page.

-Jim

jdMorgan

2:31 pm on Jul 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd suggest looking up the requesting IP addresses at ARIN, RIPE, APNIC, etc., and examining the image files' HTTP response headers in detail using a tool such as the Live HTTP Headers add-on for Firefox and Mozilla-based browsers.

You may be getting these requests from a caching proxy, such as those used by AOL, EarthLink, etc. and by many satellite-internet providers. Or your image-file response headers may be marking the files as "must-revalidate" and/or sending them with very short "Expires" times. Or you might be dealing with a broken or misconfigured client or proxy.
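As a rough illustration of the header check described above, here is a minimal Python sketch that flags response headers likely to make clients revalidate often. The function names, the one-hour `min_lifetime` threshold, and the wording of the notes are my own assumptions for illustration, not anything Apache or a browser defines.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a dict."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return dict(resp.headers.items())


def cache_findings(headers, now=None, min_lifetime=3600):
    """Return notes on headers that could cause frequent revalidation.

    `min_lifetime` (seconds) is an arbitrary threshold for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    h = {k.lower(): v for k, v in headers.items()}
    notes = []

    # Look at each Cache-Control directive in the order it was sent.
    directives = [d.strip().lower() for d in h.get("cache-control", "").split(",")]
    for d in directives:
        if d in ("must-revalidate", "no-cache", "no-store"):
            notes.append(f"Cache-Control contains {d}")
        elif d.startswith("max-age="):
            try:
                age = int(d.split("=", 1)[1])
            except ValueError:
                continue
            if age < min_lifetime:
                notes.append(f"max-age is only {age}s")

    # A missing, invalid, or near-term Expires date shortens cache lifetime.
    if "expires" in h:
        try:
            expires = parsedate_to_datetime(h["expires"])
        except (TypeError, ValueError):
            notes.append("Expires is not a valid date")
        else:
            if expires.tzinfo is None:
                expires = expires.replace(tzinfo=timezone.utc)
            remaining = (expires - now).total_seconds()
            if remaining < min_lifetime:
                notes.append(f"Expires is {int(remaining)}s from now")

    # Without a validator the client cannot make a conditional request.
    if "etag" not in h and "last-modified" not in h:
        notes.append("no ETag or Last-Modified validator")
    return notes
```

Something like `cache_findings(fetch_headers(url))`, with `url` pointing at one of the affected icons, would then list anything suspicious. An empty list does not rule out a broken client or proxy; it only suggests the server's own headers are not the obvious cause.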

Gather all of the detailed information that you can about the requests and their source(s), and the image file request responses from your server. With more information, this problem may start to make sense.
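One way to gather that information from the logs is a small script along these lines. It is only a sketch: it assumes Apache's "combined" log format with English month names, and the `find_bursts` name, the 5-second window, and the threshold of 20 requests are illustrative choices of mine.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumes the Apache "combined" LogFormat; adjust for other formats.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def find_bursts(lines, window=5, threshold=20):
    """Report (host, path) pairs with at least `threshold` requests
    inside any span of `window` seconds."""
    hits = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like combined-format entries
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        hits[(m["host"], m["path"])].append((when, m["status"], m["agent"] or ""))

    bursts = []
    for (host, path), entries in hits.items():
        entries.sort(key=lambda e: e[0])
        peak, start = 0, 0
        # Sliding window over the sorted timestamps.
        for end in range(len(entries)):
            while (entries[end][0] - entries[start][0]).total_seconds() > window:
                start += 1
            peak = max(peak, end - start + 1)
        if peak >= threshold:
            bursts.append({
                "host": host,
                "path": path,
                "peak": peak,
                "statuses": dict(Counter(e[1] for e in entries)),
                "agents": sorted({e[2] for e in entries}),
            })
    return sorted(bursts, key=lambda b: b["peak"], reverse=True)
```

Passing an open log file to `find_bursts` gives, for each offending host and file, the peak request count, the mix of status codes (for example all 304s), and the user agents involved. The hosts it reports are the ones worth looking up at ARIN, RIPE, or APNIC.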

Jim