
Forum Moderators: phranque


Trying to understand some weird traffic on a site

     
8:46 am on Dec 20, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


I am by no means a pro log reader, so I have not been able to figure this out... A few months back, I noticed in AWStats that one page on our site had started getting ridiculously high hits, way higher than anything else on the site, by thousands per month. The 'views' are in the thousands, but the entries and exits are just in the hundreds, which I don't understand. What was especially weird was that the page is just some very boring, basic text... and it normally ranked at the very bottom. I looked at everything numerous times and didn't see any backlinks or clues as to where these views were coming from. I just sort of gave up and forgot about it.

This month, I noticed the mystery traffic had completely disappeared. Curious again, I went back and started looking at various things. The only thing I can see that corresponds is that traffic from China increased each month in direct proportion to the numbers on the page in question. In the month when traffic on that page was highest, China was ranked #1. Normally it never would be.

Unfortunately, I never thought to look at the raw logs during the period when the hits on that page were high, so I'm not sure what more I can do to figure out what was going on. The page in question has absolutely nothing on it that is controversial or valuable in any way. It doesn't even have any product photos.
10:30 am on Dec 20, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 891


Oftentimes hack attempts will choose an insignificant page as their focus. They may run their attempts from one server on one IP address or several servers on many IPs. This may continue until the end of the server instance, which could be minutes, hours or days. No limits really.

I highly suggest investing some time in getting to know your raw logs better. Third-party software (AWStats) is limited and only slightly better than nothing at all.

If you have access to your server's file tree (via ftp or account shell) you might want to give it a look to see if any of your files have recently been updated or unfamiliar files added.
11:39 am on Dec 20, 2016 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 9, 2000
posts:25850
votes: 847


Yes, dpd1, can you not get to your server logs and look back at the history?

It could have been a hack attempt, and that page may also have ended up stuck in an automated loop. Check the page, and others, in case they were successful in injecting their malware on the site or leaving rogue pages sitting there on your server. Take a look in a directory called scripts and see if there's anything in there that looks suspicious.
10:34 pm on Dec 20, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


OK.... It started happening again in the last 24 hours, so I did look at the logs. There are clusters of hits on the page in question... each cluster being anything from 4 to about 20 in a row or more, and they're separated by about 1-2 seconds. In fact, numerous times the clusters are exactly 20 hits. Each cluster is a completely different IP. The only common thing between them is that they all have: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

In FTP, I don't see anything that looks unusual. That particular page has the same time-stamp as everything else, which was the last time things were updated.
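
For anyone who wants to pull the same clusters out of a raw log without reading it by eye, here is a minimal sketch. It assumes the host writes Apache's "combined" log format, which may not match every shared host. The function names `hits_by_ip` and `summarize` are made up for illustration, and the page path you pass in is whatever page is being hit.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed: Apache "combined" log format. Adjust the pattern if your host logs differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The user agent reported in the post above.
SUSPECT_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def hits_by_ip(lines, path, agent=SUSPECT_AGENT):
    """Return {ip: [datetime, ...]} for requests to `path` sent with `agent`."""
    hits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if match["path"] != path or match["agent"] != agent:
            continue
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        hits[match["ip"]].append(when)
    return dict(hits)


def summarize(hits):
    """Return (ip, hit_count, seconds_from_first_to_last_hit), biggest count first."""
    rows = []
    for ip, times in hits.items():
        span = (max(times) - min(times)).total_seconds()
        rows.append((ip, len(times), span))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Feeding it the lines of a downloaded log file and the path of the affected page gives one row per IP, so runs of roughly 20 hits a second or two apart should stand out as a count near 20 with a span of well under a minute.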
6:05 am on Dec 21, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15452
votes: 739


can you not get to your server logs and look back at the history

If he's on shared hosting, as most people are--at least to start with--old logs probably just don't exist. My host only keeps them 3 days by default. (Naturally I've upped it to 15 days, which gives me ample time to download logs even if it's been one of *those* weeks. And then they stay on my hard drive forever.)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Funny you should say that. When my host added mod_security, one of the user-agent patterns they flagged was
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
With or without the SV1 you may safely block them; there's no danger of inadvertently excluding humans.

Idle query: Does anyone, anywhere, still use MSIE 6? Poor countries in eastern Europe prefer Opera; third-world countries prefer off-brand mobile devices. Even dirt-poor governments must be on 7 or 8 by now. (I thought I'd found some human MSIE 6, but it's because I was casting my net a little too widely and landed some from 2011.)
6:30 am on Dec 21, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


Thanks... That's correct, they only gave me access to a certain amount of the logs. However, they just updated, and it appears I can set it to download all the time now.

I did a search and that specific user agent comes up on this list: [projecthoneypot.org...]

I have no idea how to block stuff. I would be afraid to screw something up, since this is a sales site. The page in question has no email tag or comment fields, or anything like that, so I don't know why it finds it so interesting. I noticed that the traffic claiming to be from China went up once again at the same time these hits went up today.
6:53 am on Dec 21, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


I ran all of the IPs, and so far all of them do come back to China. Some were even listed to some sort of "advertising" company there. I don't do any business with China, but I do get a crap ton of spam from them. Should I just block these IPs?
7:15 am on Dec 21, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 891


I have no idea how to block stuff.
Should I just block these IPs?
They will likely stop on their own.

I would never advise blocking anything unless you know what you're doing.

Do some reading here in these different forums, especially in the Apache, Webmaster General and the Search Engine and User Agent forums.

Just know that if you block anything, you need to keep consistent, daily watch on your logs to see just who is getting blocked.
10:28 am on Dec 21, 2016 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11571
votes: 182


you might consider using some methods outlined in lucy24's post here:
https://www.webmasterworld.com/apache/4827483.htm?highlight=msg4827679#msg4827679
6:54 pm on Dec 21, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


Thanks. What she's doing is probably a little over my head. I just thought maybe I could at least block the offending IP in IP blocker in cPanel, or maybe even look up the range that each offending IP is in, and just block the whole range. I get nothing but grief from China on multiple levels, so I am not worried about losing traffic from there.
7:34 pm on Dec 21, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15452
votes: 739


post here

Jiminy, phranque, do you realize that when you put "highlight" in the URL, the whole post comes through with a REPORTED label as if there was something wrong with it? Gave me an awful fright.

I get nothing but grief from China on multiple levels, so I am not worried about losing traffic from there.

I just block robots that claim to speak Chinese. This is a little unfair to humans in Hong Kong and Taiwan, but oh well.
1:58 am on Dec 22, 2016 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11571
votes: 182


Jiminy, phranque, do you realize that when you put "highlight" in the URL, the whole post comes through with a REPORTED label as if there was something wrong with it? Gave me an awful fright.


haha sorry i never noticed that.
that's what i get for hacking urls...
10:55 pm on Dec 23, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


So I've been going through the logs, and... I'm quickly realizing that blocking IPs is a lost cause. I've only gone through a few days, and I already have a giant list of IPs that are doing this. Only a few are duplicates, or are even neighboring. I would have to ban giant chunks to make a dent, which makes me a little nervous. Lucy, or anyone, do you know how I could ban that user agent on a shared host? It seems like all I can do through cPanel is use the IP block. I don't see any way to ban a user agent, if there is one. Thanks.
11:19 pm on Dec 23, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 891


I'm quickly realizing that blocking IPs is a lost cause. I've only gone through a few days, and I already have a giant list of IPs that are doing this
Depends on how much security you want to have. The more security, the more work & time it takes. I block over 6800 ranges.

When dealing with IP ranges, it's important to understand the difference between an IP address that is compromised temporarily and one that is home to malicious actors long-term.

Much of this distinction is discussed in the Search Engine & User Agent forum [webmasterworld.com] and especially in the ongoing Server Farm [webmasterworld.com] discussion.
3:01 am on Dec 24, 2016 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11571
votes: 182


if you try solving this problem by banning IPs and ranges and user agents, you'll be playing an unwinnable game of whack-a-mole and chasing Donkey Site Banger Gold [webmasterworld.com] scrapers.

i'm going to predict that if you study the patterns long enough you will discover that a large majority of your non-human requests don't provide an Accept header or specify a preferred language.
(you won't find this in your server logs.)
a ban on this traffic requires no ongoing maintenance.
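
A rough sketch of that header check, for illustration only. It assumes the site runs behind a Python WSGI application, which is probably not the case on a typical shared host, and it reads the advice as "flag a request if either the Accept header or the Accept-Language header is missing". The names `looks_automated` and `HeaderGate` are hypothetical. Some legitimate crawlers may also omit these headers, so anything like this would need testing against real traffic before going live.

```python
def looks_automated(environ):
    """True if the request lacks an Accept header or an Accept-Language header."""
    accept = environ.get("HTTP_ACCEPT", "").strip()
    language = environ.get("HTTP_ACCEPT_LANGUAGE", "").strip()
    return not accept or not language


class HeaderGate:
    """WSGI middleware that answers 403 to requests flagged by looks_automated."""

    def __init__(self, app):
        self.app = app  # the wrapped WSGI application

    def __call__(self, environ, start_response):
        if looks_automated(environ):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Headers look browser-like, so hand the request to the real application.
        return self.app(environ, start_response)
```

The rule depends only on which headers the client sends, not on who the client is, so there is no list of IPs or user agents to keep up to date.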
5:21 am on Dec 24, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15452
votes: 739


blocking IPs

I hope you are not bothering about exact IPs in the form
11.22.33.44.
Instead, when you find an offending IP, look it up and learn its whole range, for example

11.22.32.0/19 (which means 11.22.32.0 through 11.22.63.255)
or
11.22.0.0/17 (11.22.0.0 through 11.22.127.255)
or maybe even
11.20.0.0/14 (11.20.0.0 through 11.23.255.255)

If the overall range looks like a human ISP (you'll recognize the big names within your own country), it's an infected browser and you can just ignore it--unless it's from some country that you consider expendable. If it's a server farm or colo (look for name elements like "Rack" or "Host"), go ahead and block the whole thing.
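
If the slash notation is unfamiliar, the arithmetic can be checked with Python's standard `ipaddress` module. This is just a sketch using the example ranges above; the helper names `range_bounds` and `in_range` are invented for illustration.

```python
import ipaddress


def range_bounds(cidr):
    """Return the first and last address of a CIDR range as strings."""
    network = ipaddress.ip_network(cidr, strict=True)
    return str(network[0]), str(network[-1])


def in_range(ip, cidr):
    """True if the single address `ip` falls inside the CIDR range `cidr`."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


for cidr in ("11.22.32.0/19", "11.22.0.0/17", "11.20.0.0/14"):
    first, last = range_bounds(cidr)
    print(cidr, "=", first, "through", last)
```

The example address 11.22.33.44 falls inside all three of those ranges, which `in_range("11.22.33.44", "11.22.32.0/19")` and the equivalent calls for the other two will confirm.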
7:52 pm on Dec 24, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


Thanks. So far I've been going through my IP list for 2016 and going after the biggest offenders, who also hit numerous times across the whole year. I figure that way they're not just one-timers. Then I check the location, and check it on a couple sites, to see what other people have reported and what neighboring IPs are doing. Then yes, I ban the 0-255 range. Some of my biggest ones seem to be very well known and pretty permanent. So I will at least get rid of those. I wish I had the nerve to do a whole country ban, but I worry about accidentally hitting the wrong people. It's a multi problem thing with me, because not only do I get stuff like this from China, I have also dealt with product design theft and other things from there.
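
Spotting which offenders share a 0-255 block can be done mechanically. The sketch below groups a list of IPs by their /24 (the 0-255 range mentioned above) and counts them. The function name `count_by_slash24` is made up, and a /24 is only a rough stand-in for the real allocation, which a lookup may show to be larger or smaller.

```python
import ipaddress
from collections import Counter


def count_by_slash24(ips):
    """Count addresses per /24 block, most frequent block first."""
    counts = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its enclosing /24.
        block = ipaddress.ip_network(f"{ip}/24", strict=False)
        counts[str(block)] += 1
    return counts.most_common()
```

Blocks that show up with several distinct offenders are the ones worth looking up first.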
8:48 am on Dec 28, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


I decided to give Incapsula a try. After reading about it, I realized it would give me the easiest way of doing what I want, with added benefits as well. Unfortunately... After spending 3 days working on a block list and getting it all ready, I looked at US ranges and realized how easy it would be to make a mistake. I also realized I don't want to have to keep going through this constantly. The internet is only a small portion of my responsibilities. A CDN won't fix everything, but I think it will cut down a lot of the nonsense.
9:35 am on Dec 28, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 891


I decided to give Incapsula a try
That may turn out to be a good fit for you. Let us know how it goes.
7:53 pm on Dec 29, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


So far, Incapsula is pretty awesome. I just went with trying the lowest free plan, but that still offers a lot of features. I was a little annoyed when I realized they stick a pop-up icon on the lower right of the site with that. But then I decided it wasn't that bad. Considering what you get for free, it's a good trade. And it basically just looks like a little security badge or something. Blocking by IP ranges or country is extremely easy. Then you get live detailed reports on everything that was blocked that way, plus anything that they just decided to block on their own... which is a lot. I blocked one small country which is known for being a problem, and four foreign IP ranges that were recently very active with bots. So far, all traffic that tried to come through those has been 100% malicious in nature. So it's working well. And on top of that, my site works faster for people. Two thumbs up. Obviously you are at their mercy as far as them setting up the country block IP data. But considering how big they are and who uses them, I'm sure they are very serious about updates.
10:18 am on Jan 20, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 25, 2007
posts:1110
votes: 9


Just a followup... The original traffic I posted about has mostly been blocked, but I still see tons of it trying. A couple got through because they were IPs from outside the blocked country. They still do the same thing... Dozens and dozens of hits a couple seconds apart, on the same pages. Each page will get hit a bunch of times, then it comes back on a different IP and hits another page a bunch of times. Incapsula is ID'ing all of it as Wget.
 
