Forum Library, Charter, Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

This 354 message thread spans 12 pages: < < 354 ( 1 2 3 4 [5] 6 7 8 9 10 11 12 > >     
Logs Show Surge, but Not Human?

 9:23 pm on Feb 21, 2012 (gmt 0)

On one site I work with, I've seen traffic go from 10K visits/day to 40K. The additional traffic looks human at first glance: it is captured by Google Analytics, it comes from diverse consumer IPs in the US and Europe (but not Asia), and the bounce rate is high, but roughly one visit in ten loads another page.

On the non-human side, we have all of the traffic coming with no referrer, and it is all focused on a few pages that are hardly viral linkbait and would get one or two views on a good day. It's all IE (spread among 6 - 9), and a range of screen resolutions that look unusually aged (e.g., 1024x768).

Anecdotally, I've heard of a few other sites seeing this kind of traffic, but nobody knows what the purpose might be. It's not scraping, as it's the same pages that get hit. It's not intense enough to be an attack to take the site down, nor is the site likely to be the target of miscreants.

The level of traffic has gone up and down, but it's still happening.

Are any of your sites seeing this, and do you have any theories?

Any thoughts on screening this out of Analytics? It totally blows up time period comparisons.
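
Before settling on an Analytics filter, it may help to measure how much of the surge matches the signature described above (no referrer, IE 6 - 9, a handful of pages) in the raw server logs. The following is a minimal sketch, not a tested filter: it assumes the server writes the common "combined" log format, and the function name and the target_paths argument are hypothetical.

```python
import re

# Assumed: Apache/Nginx "combined" log format. Adjust if your logs differ.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
# Matches the "MSIE 6." through "MSIE 9." tokens sent by IE 6 - 9.
IE_RE = re.compile(r"MSIE [6-9]\.")


def is_suspect(line, target_paths):
    """Return True if a log line matches the signature from this thread:
    no referrer, IE 6 - 9 user agent, and one of the pages being hit."""
    m = LOG_RE.match(line)
    if m is None:
        return False  # unparseable lines are never flagged
    no_referrer = m.group("referrer") in ("", "-")
    old_ie = IE_RE.search(m.group("agent")) is not None
    return no_referrer and old_ie and m.group("path") in target_paths
```

Counting flagged lines per day gives a rough idea of how far the Analytics numbers are inflated. Real IE users who type the URL directly will match too, so treat the count as an upper bound.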



 2:43 am on Mar 6, 2012 (gmt 0)

Because of the bot, or were there other factors?
I've never failed in stopping bots hitting my sites. There must be something unique to them. Did you try my script above?


 1:39 pm on Mar 7, 2012 (gmt 0)

Just wondering if anyone has an update on this here? Has anyone found a permanent solution? Do we know who is behind it?


 3:37 pm on Mar 7, 2012 (gmt 0)

Nope, nope and nope.

I contacted the Gomez people; sent em a bunch of logfiles and screenshots at their request, they escalated it, and came back swearing it wasn't them. And the traffic *didn't* stop.

Also, they tell me that they include Gomez in the User-Agent. Which may or may not be true, but short of signing up for the program and running it myself, I got no reason not to believe them.

Could be another company doing the same thing (gathering benchmarks for performance testing). Could be a test for a bigger rollout/disruption. Could be a Windows virus. Could be anything. But it doesn't stop, and unfortunately, the solution we thought we found wasn't it.

Seb7 - I appreciate the thought, and I hope you don't take offense, but I don't know you, and I'm not a coder myself, and I can't put up anything unless I am 110% confident I know what it does and what effect it will have on my traffic at peak (which is sizable).


 4:13 pm on Mar 7, 2012 (gmt 0)

I'm going to recant my earlier statement. What I thought was the magic bullet turned out to be some web hosting company modifying HTTP headers sent by browsers under the premise of it being required for FastCGI running PHP, which was total nonsense.

In other words, my data set was tainted.

Some things I found still work but are more site specific, not a panacea, nothing just anyone could use to block all the traffic.

Best I can tell from several samples these are either real browsers being used or really good fakery. The only real solution to the problem may be to build a list of the IPs involved in the attack, gathered from multiple sites, and see if they are using any IPs in common and build a block list of all repeat offenders.
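
The cross-site comparison suggested above could be sketched roughly like this. It assumes each site owner has already pulled the suspect IPs out of their own logs; the function name, the ips_by_site mapping, and the min_sites threshold are all hypothetical.

```python
from collections import Counter


def repeat_offenders(ips_by_site, min_sites=2):
    """Return the IPs seen on at least `min_sites` different sites, sorted.

    `ips_by_site` maps a site name to an iterable of suspect IP strings.
    """
    counts = Counter()
    for ips in ips_by_site.values():
        counts.update(set(ips))  # count each IP at most once per site
    return sorted(ip for ip, n in counts.items() if n >= min_sites)
```

Raising min_sites makes the list smaller and safer to block, at the cost of missing IPs that only hit one contributor's site.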


 4:36 pm on Mar 7, 2012 (gmt 0)

Wouldn't that kind of be... ginormous?


 4:40 pm on Mar 7, 2012 (gmt 0)

Not necessarily. Define ginormous :)

Probably going to run across 10s of thousands of them that are identifiable.


 4:44 pm on Mar 7, 2012 (gmt 0)

What kind of NUMBERS (sec,min,hour,day) are we actually talking about? What speed hit? Is there a time of GMT in common? As incrediBILL mentions, a wider data set is necessary. My six sites have not seen this... yet! Everything bad seems to get to me sooner or later. (sigh)


 4:47 pm on Mar 7, 2012 (gmt 0)

At the peak my homepage was apparently getting 6,000+ visits from this thing. That's been down to about 1,200 for a few days, but it might be inching up again, not sure yet.

I feel bad for people who are really getting hammered by this thing.



 5:10 pm on Mar 7, 2012 (gmt 0)

Mine are going back and forth between 100 visits an hour to 500 visits an hour. It had slowed down some (we've put in a few things to block stuff like non English browsers - the site is pretty tightly targeted to US traffic) but now today it looks like it's picking up again.

For me it goes on at all hours of the day and night. In the wee small hours where, this time of year, I have NO traffic on this site, it's still going.

It's a slow drip, my real time analytics will usually show anywhere from 1-10 of these bot visits on the site simultaneously. Occasionally as much as five minutes goes by without one, but then they come back in a burst.

They *do* repeat - some of them have come back as many as 6 or 10 times.

I am keeping all my logfiles around (I do anyway) and if you think you are being hit by this, you might do the same. At some point maybe we can find a way to compare notes, or see if we're all being hit by the same IPs.


 8:00 pm on Mar 7, 2012 (gmt 0)

Ok folks - I have same problem on one of my sites. I have been trying to track this thing down for about a week.

Here's what I am currently trying. Since this thing does execute JavaScript - I downloaded and installed the AXS tracking script from Matt's.

I have four webpages getting pounded by this thing and I'm tracking all the data that AXS gives for these webpages. I'll be looking for a pattern.

Also, I set up an IP ban script not identified within the robots.txt and htaccessed out all known good robots. I'll see what I catch.

I'll share what I learn


 9:57 pm on Mar 7, 2012 (gmt 0)

Update: On these four webpages traffic (page views) is typically as follows:

Page 1) 1500/day
Page 2) 500/day
Page 3) 120/day
Page 4) 350/month

Page 4 had 8000+ visits yesterday

I took Adsense ads off all..

Page 4 (low traffic) I put my automatic ip ban script and am banning all visits by ip in my htaccess.

I'm seeing a significant drop in direct traffic
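
For anyone wanting to try an htaccess ban like the one described above, here is a rough sketch of turning a collected IP list into deny rules. It is not the poster's script, which isn't shown in this thread. It assumes Apache 2.2-style access directives (Order / Allow / Deny), which on Apache 2.4 need mod_access_compat or the newer Require syntax instead; the function name is hypothetical.

```python
import ipaddress


def htaccess_deny_block(ips):
    """Build an Apache 2.2-style deny block from an iterable of IP strings.

    Malformed entries are skipped and duplicates are collapsed.
    """
    valid = set()
    for raw in ips:
        try:
            valid.add(ipaddress.ip_address(raw.strip()))
        except ValueError:
            continue  # not a well-formed IPv4/IPv6 address
    # Sort by IP version first so IPv4 and IPv6 are never compared directly.
    ordered = sorted(valid, key=lambda a: (a.version, int(a)))
    lines = ["Order Allow,Deny", "Allow from all"]
    lines += [f"Deny from {ip}" for ip in ordered]
    return "\n".join(lines) + "\n"
```

The output is meant to be reviewed and pasted in by hand. A list gathered by a trap can contain legitimate visitors, so it is worth vetting before it goes live.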


 10:25 pm on Mar 7, 2012 (gmt 0)

@edge (or others with same problem regarding specific pages being routinely hit) have you just changed the filename, modified top down internal links to that page and NOT 301'd? Just accept the 404 with no worries.

Reason I ask, back in 2003 or 2004 I had something like this happen to one of my sites and that's what I did. The search engines found the pages again, ended up in serps, but the bad actors didn't follow the change, ie. they disappeared.

Might be worth trying, after all, the value of these pages is currently kaput. I'd rather have a week or so wait to be rediscovered than to get hammered.

Also, what referers (sic) are attached to the hits? Anything there?


 11:19 pm on Mar 7, 2012 (gmt 0)

Mine is hitting solely the home page.


 11:21 pm on Mar 7, 2012 (gmt 0)

Interesting, these four pages of mine that are getting pounded are all positioned #1 - #4 in the serps for very competitive keywords in my vertical.

@ tangor - last thing I want to do is change the filenames as these pages are old with tons of backlinks.

I've collected 523 unique IPs with my trap. They are beginning to recycle more and my direct hits are starting to look more normal.


 11:32 pm on Mar 7, 2012 (gmt 0)

My home page is hoarding all these hits for itself, it never has played well with the rest of the site. :)

[edited by: ken_b at 11:33 pm (utc) on Mar 7, 2012]


 11:33 pm on Mar 7, 2012 (gmt 0)

@ tangor - last thing I want to do is change the filenames as these pages are old with tons of backlinks.

Which begs the question, do the pages getting hammered have common backlinks? Not every link is valuable. (Been there, done that.) Not offered as argument for argument's sake, just suggesting (as I discovered way back when dealing with the same thing) that those pages I mentioned had common backlinks which were not found elsewhere on my site. Things returned to normal and new backlinks eventually replaced those others... Meanwhile, for the backlinks I highly desired and knew to be good, I sent a little email letting them know the filename had changed. YMMV, but sometimes one just puts out the fire then rebuilds the house.


 11:49 pm on Mar 7, 2012 (gmt 0)

but sometimes one just puts out the fire then rebuilds the house.

Well, these are four pages of a 24,000+ page website in a tough vertical. Worst case I figure is that I pull the AdSense and the Google Analytics and ride out the storm. Google will not know any different and my servers can easily handle the load.

Eventually, I'll figure this out and shut them down...


 3:03 am on Mar 8, 2012 (gmt 0)

Anyone want to try my java code above?

It evaluates if it is a bot using a scoring method, double checks, then stops the page going further if it is.
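
For readers who haven't seen a scoring approach before, the general idea can be sketched as below. This is not the script offered above, which isn't reproduced on this page. The signals are taken from what others in this thread have reported, and the weights, the threshold, and every name here are made-up assumptions that would need tuning per site.

```python
def bot_score(referrer, user_agent, path, watched_paths, prior_hits_from_ip):
    """Add up points for each signal reported in this thread.

    The weights are illustrative guesses, not measured values.
    """
    score = 0
    if not referrer or referrer == "-":
        score += 2  # all of the suspect traffic arrived with no referrer
    if any(f"MSIE {v}." in user_agent for v in (6, 7, 8, 9)):
        score += 1  # suspect traffic was all IE 6 - 9
    if path in watched_paths:
        score += 2  # request is for one of the pages being hammered
    if prior_hits_from_ip >= 5:
        score += 1  # the same IP keeps coming back
    return score


def should_block(score, threshold=5):
    """Block only when several signals line up at once."""
    return score >= threshold
```

No single signal is enough on its own with these weights, which keeps ordinary IE visitors who type in the URL from being blocked on their first visit to an unaffected page.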


 3:27 am on Mar 8, 2012 (gmt 0)

We've been hit by the same thing. Starting Feb 21st/22nd homepage visits (direct traffic) on one of our sites went from a few hundred to near enough 10,000. There was a drop off over the next week but the figure's still holding steady at around 3,600 extra visits per day.

Glad to have found this thread.


 3:46 am on Mar 8, 2012 (gmt 0)

Status update: Direct and Organic traffic is now back to normal/expected. I have captured and banned 1045 ip addresses so far.

error_log shows that I am issuing a "denied" access to these four files every second or so.

Still catching and banning a new ip address every 50 – 120 seconds.

Looking at the IP addresses most trace back to:

Montevideo, UY via Mexico
South Brisbane, AU
Amsterdam, NL
Aspen, Colorado
and a few minor others.

These four pages generate around $300 monthly – I’m going to keep AdSense off until I’m sure I got a handle on this. I’ll be fine…


 1:20 pm on Mar 8, 2012 (gmt 0)

Yea Portugal was the #2 country on mine, after the US.


 1:24 pm on Mar 8, 2012 (gmt 0)

Query: the pages being hit... are they best/highest performers for income? Which raises the question, if so, why hit?


 1:31 pm on Mar 8, 2012 (gmt 0)

I got this bot-net thing corralled and under control. All traffic numbers/balance look normal. I have removed my auto banning script and have resumed normal operations on these affected web pages. I have not put AdSense back on them though.

It took my ban script somewhere between 2 – 4 hours to collect enough ip’s to effectively shut this thing down.

Anybody I recognize that is interested in my auto htaccess ban script or my collection of ip addresses is welcome to them – just sticky me.

I am sure my IP address collection has a few email harvesters, downloaders and maybe even a legitimate robot – so one might vet the list.


 1:32 pm on Mar 8, 2012 (gmt 0)

Query: the pages being hit... are they best/highest performers for income?

Definitely not in our case.


 2:29 pm on Mar 8, 2012 (gmt 0)

Query: the pages being hit... are they best/highest performers for income?


Mine is a non-ecom site, info only. I sell nothing on the site.


 5:20 pm on Mar 8, 2012 (gmt 0)


I just sent you a sticky, can you please send me the list so I can ban these IPs? This has been hurting me pretty bad since February 21st and I can't seem to stop them.

I emailed my web hosting company about this attack and they acted like nothing was going on and I was delusional.


 6:54 pm on Mar 8, 2012 (gmt 0)

Other than the home page, the pages hit on my site were unexceptional - in fact, the high traffic to those pages was the clue to what was going on. They had previously generated minimal traffic.


 12:57 pm on Mar 9, 2012 (gmt 0)

It is causing me serious issues, it is crashing my site regularly. I need to get these ip addresses blocked.


 1:11 pm on Mar 9, 2012 (gmt 0)

After multiple pings, Compuware contacted me back yesterday and I provided the site URL. No word yet. I'm not anticipating anything like, "Hey, that IS us, sorry, we'll stop."


 1:17 pm on Mar 9, 2012 (gmt 0)

Does anyone have an IP address list that I can add to my htaccess to block? Or is Edge the only one who has this? I am waiting for him to reply, however my site has been crashing 50% of the time lately, and my web hosting company does not seem to care. This is going to shut my site down.

I am looking at my visitor IPs now, but there are so many, and I don't know how to tell which ones to look at as possible bots. Even when I look up the IPs they appear to be legit.


 1:21 pm on Mar 9, 2012 (gmt 0)

Well I tried Edge's list, and it didn't match up with the ones hitting mine.

Roger, I contacted Compuware as well, and they escalated it, but came back and said no, it wasn't them. They say that they identify themselves in their User Agent.

Someone from Google popped in the Google Analytics help forum in Google Groups and said they were looking into it. They have more resources than we do. But don't hold your breath for anything happening right away.

Right now I have it not serving AdSense or analytics code to anyone who comes in on IE with no referrer, and I'm hoping for the best. It's all I can do.
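
The rule described above (hold back AdSense and analytics code from IE visitors with no referrer) could look something like this on the server side. It is a sketch only: the function name is hypothetical, and it assumes IE identifies itself with an "MSIE" token in the User-Agent, which holds for IE 6 - 9.

```python
def serve_tracking(user_agent, referrer):
    """Return False for IE requests that arrive without a referrer.

    Assumes IE sends an "MSIE" token in its User-Agent (true for IE 6 - 9).
    A referrer of None, empty, or "-" is treated as missing.
    """
    is_ie = "MSIE" in (user_agent or "")
    cleaned = (referrer or "").strip()
    has_referrer = cleaned not in ("", "-")
    return not (is_ie and not has_referrer)
```

The page template would call this once per request and leave out the ad and analytics snippets when it returns False. Genuine IE users arriving from bookmarks or typed URLs get left out as well, which is the trade-off being accepted here.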

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved