Forum Moderators: phranque


Cloudflare Experiences With Security Help

Will it solve my problems?

         

vegasrick

8:45 am on Feb 3, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hopefully I can get some information from some of you guys.

To make a long story short.

1. I have a very popular sports site. 13 years old. 2-4 million uniques per month depending on what's going on.

2. Last month we ported our site from a 10 year old CMS (mostly HTML and PHP code) to a combo of Laravel/CodeIgniter.

3. After a month, we started getting flagged by a company called Ad Integral Science (AIS for short). They monitor traffic for ad networks. Basically they claimed the majority of the traffic their clients were receiving was automated or fake. They grilled me about whether I was buying traffic for the site, which was a joke to me. Three ad networks who work with them stopped serving ads to our site because AIS gave us a low score for our traffic in their database.

4. AIS won't provide me with logs of the bad traffic. They offered to set me up with a traffic monitoring package which starts at 3,000 per month!

5. Unfortunately we have our server logs turned off because we get so many hits that they grow humongous in size within a day or two. My server team ran a few scans at one point but the only thing they saw was a very high volume of search bots from Google and Bing (which makes sense since we have millions of pages due to our large forum).

6. Google Analytics looks fine and shows nothing strange, but it doesn't really log bot traffic, especially malicious bots.

7. This past weekend another ad network that works with AIS dropped us, claiming that 68% of my traffic was suspicious over a three day span.

8. I obviously get frantic, because someone is either out to hurt our site by directing automated traffic or just some hacking program that won't stop coming at us.

9. I thought about activating mod_security but it's sometimes too sensitive (from my experience) and not the best for every situation.

10. So I look for some security and go to Cloudflare (based on price, as some of the others charge by traffic volume). I get their business package (200 per month).

11. In the last 24 hours alone they challenged or blocked 11,000 threats (or what they felt were threats).

These are my questions.

How well does Cloudflare work in terms of security to deal with these kinds of issues with obvious automated traffic, fake traffic, bots attacking the site?

Are there any functions on Cloudflare that I should have on that are not on by default to deal with this?

Is there someone else that I should be using (in terms of a service better fit to deal with this issue)?

I'm paranoid that if Cloudflare does not solve this, I could keep losing ad networks until we go under. A few staffers and I work this site as our full-time jobs and have done so for years. Or worse, I get stuck paying AIS 3K per month to monitor my traffic.

Hope some of you with more experience with these issues can help me here. In my 13 years of running the site, we've had our share of hackings and attacks, but nothing like these automated traffic attacks or bot floods until now.

Swanny007

8:38 pm on Feb 3, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I love CloudFlare and I use it on all of my larger sites. It does a much better job of blocking bad traffic than I can do.

I am not at all familiar with AIS but I'd say off hand any real techie questions should be directed to CloudFlare now that you're on there.

What I have done is, when I notice bots I don't like in my logs (e.g. ia_archiver, LinkDex), I go and manually block their IPs in CloudFlare. That is a losing battle, but it helps.

grouchy sysadmin

3:04 am on Feb 4, 2016 (gmt 0)

10+ Year Member



I've never used CloudFlare, but it sounds like addressing point 5 would give you a better idea of what type of long term solution would be needed. You could try using Piwik to process log files, and give you insight into the bot traffic. Once you know what to block, I'd consider adding Nginx as a reverse proxy to filter inbound traffic. It's more work than CloudFlare, but it can be tailored to your specific site.

tangor

3:13 am on Feb 4, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



AIS is a middleman involved in traffic monitoring. Some say they are shakedown artists, some claim they offer advertisers protection. I don't know which is true, but once they get in the middle (from the advertiser's point of view) you either play their game or lose the advertiser.

Might want to look at selling and serving your own ad space, given the metrics you've revealed. You might do tenfold your present business by doing it in house... and never worry about AIS or others again. (Been there, done that, will do it time and again, but it is more work and will fill your ordinary day!)

vegasrick

9:00 am on Feb 4, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



@tangor, they are basically trying to shake me down. None of the other monitoring systems are picking up their claims. I'm hoping to raise my scores with them through Cloudflare to avoid paying them 3K for a month of monitoring.

vegasrick

10:12 pm on Feb 4, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Is this normal?

According to Cloudflare these are our bot hits, search engine wise, for the last 24 hours. And Bing is cracking us hard - 247K!

Bing: 247,092
Google: 50,589
Yandex: 9,044
MSN: 5,237
Baidu: 2,486

Swanny007

10:31 pm on Feb 4, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Bing crawls way more than Google in my CloudFlare stats too. I gave them a 5-second delay in robots.txt and that fixed the issue. I don't mind them crawling but considering how much traffic they send compared to GOOG I said too bad, slow down. Your numbers look normal to me.

vegasrick

10:38 pm on Feb 4, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



@swanny, how do I set that up? The delay that is?

Would a delay in any way affect them indexing our stories into Bing News?

grouchy sysadmin

12:34 am on Feb 5, 2016 (gmt 0)

10+ Year Member



It can be done by adding a crawl-delay to the robots.txt. For example,

User-Agent: bingbot
Crawl-delay: 5

See [blogs.bing.com...] and [blogs.bing.com...] for details.
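
Before deploying a rule like the one above, it can be worth checking how a parser reads it. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `ROBOTS_TXT` string and the `crawl_delay_for` helper are made up for illustration, and how Bing itself interprets the value is described in the linked posts, not by this script.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, modelled on the example in this thread.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 5

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


bing_delay = crawl_delay_for(ROBOTS_TXT, "bingbot")
google_delay = crawl_delay_for(ROBOTS_TXT, "Googlebot")
print("bingbot:", bing_delay)
print("Googlebot:", google_delay)
```

Only the `bingbot` group carries a delay here, so other crawlers fall through to the `*` group and get no delay at all.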

vegasrick

12:51 am on Feb 5, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Grouchy, is there any drawback to doing that? I read on Bing's site how they recommend keeping it at "1"

grouchy sysadmin

1:11 am on Feb 5, 2016 (gmt 0)

10+ Year Member



The higher the delay, the fewer pages are crawled, so pages may take longer to show up in the index. I suspect the problem gets worse the more pages you have.

I would not slow down the bingbot or any search engine, if you rely on your content being indexed in a timely fashion. I also think it unlikely that the bingbot is causing the issue with AIS, assuming there actually is a real issue.

vegasrick

1:22 am on Feb 5, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



@grouchy, AIS seems to be auditing traffic differently and I don't know what's what. Another advertiser that I use had 9 million of my impressions audited (last 30 days) and only found 8% traffic fraud in that 9 million. But when AIS audited another network that I use, they flagged 68% during a 3 day period. Makes no sense.

keyplyr

6:48 am on Feb 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



5. Unfortunately we have our server logs turned off because we get so many hits that they grow humongous in size within a day or two. My server team ran a few scans at one point but the only thing they saw was a very high volume of search bots from Google and Bing (which makes sense since we have millions of pages due to our large forum)
IMO you are unwise to not keep a diligent eye on server access logs. This is your main source of information. Turning this feature off puts you at a huge disadvantage.

If your logs get that large, download them several times a day and analyze them in sections. Don't let them pile up; set them to start a new file every 24 hours, or even every 12 or 6 hours.
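
As a rough illustration of analyzing one log section at a time, here is a small Python sketch that tallies hits per IP and per user agent. It assumes the server writes the common "combined" log format; the `tally` helper and the sample lines (which use documentation-only IP addresses) are invented for the example and are not from the poster's site.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def tally(lines):
    """Count hits per client IP and per user agent, skipping malformed lines."""
    by_ip, by_ua = Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        by_ip[match["ip"]] += 1
        by_ua[match["ua"]] += 1
    return by_ip, by_ua


# Invented sample data standing in for one section of a real access log.
BING_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
SAMPLE = [
    f'198.51.100.7 - - [05/Feb/2016:06:48:01 +0000] "GET /forum/t1 HTTP/1.1" 200 5120 "-" "{BING_UA}"',
    f'198.51.100.7 - - [05/Feb/2016:06:48:02 +0000] "GET /forum/t2 HTTP/1.1" 200 4876 "-" "{BING_UA}"',
    f'198.51.100.7 - - [05/Feb/2016:06:48:03 +0000] "GET /forum/t3 HTTP/1.1" 200 6001 "-" "{BING_UA}"',
    '192.0.2.44 - - [05/Feb/2016:06:48:04 +0000] "GET / HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    "this line is not a valid log entry",
]

by_ip, by_ua = tally(SAMPLE)
for ip, hits in by_ip.most_common(10):
    print(ip, hits)
```

For a real file, passing the open file object (`tally(open(path))`) reads it line by line, so even a very large section never has to fit in memory.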

Since you have a "server team" this should not be an issue, however if "the only thing they saw was a very high volume of search bots from Google and Bing" maybe they need to take some classes on how to read log data. There's a lot more going on than just that.

There's also lots of traffic analytics software, some even free, that will show you quite a bit of information without spending $3k a month on a 3rd party service. I wouldn't trust the so-called "stats" from CloudFlare. A huge number of bot hits are fraud (spoofed UAs) and can be blocked. The only way for you to control this is to take control of your data and not rely on 3rd parties, who by definition, are selling you their products.
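
One common way to spot a spoofed search-engine UA is a forward-confirmed reverse DNS check: look up the hostname for the IP, confirm it ends in a domain the search engine uses for its crawlers, then resolve that hostname back and confirm it returns the same IP. The sketch below is one possible version in Python. The suffix list reflects what Google and Bing have documented as far as I know and should be checked against their current guidance; `is_genuine_crawler` and the stub resolvers are hypothetical names used only for this example.

```python
import socket

# Hostname suffixes assumed for each crawler; verify against the search
# engines' own documentation before relying on them.
TRUSTED_SUFFIXES = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def is_genuine_crawler(ip, bot,
                       reverse=socket.gethostbyaddr,
                       forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for an IP claiming to be `bot`."""
    try:
        host = reverse(ip)[0].lower()      # IP -> hostname
    except OSError:
        return False
    if not host.endswith(TRUSTED_SUFFIXES[bot]):
        return False
    try:
        return ip in forward(host)[2]      # hostname -> IPs, must include ours
    except OSError:
        return False


# Stub resolvers so the example runs without network access (invented data).
def stub_reverse(ip):
    names = {"192.0.2.10": "crawl-1.googlebot.com", "192.0.2.20": "host.example.net"}
    if ip not in names:
        raise OSError("no PTR record")
    return (names[ip], [], [ip])


def stub_forward(host):
    addrs = {"crawl-1.googlebot.com": ["192.0.2.10"]}
    return (host, [], addrs.get(host, []))


print(is_genuine_crawler("192.0.2.10", "googlebot", stub_reverse, stub_forward))
print(is_genuine_crawler("192.0.2.20", "googlebot", stub_reverse, stub_forward))
```

Calling `is_genuine_crawler(ip, "bingbot")` without the stub arguments performs real DNS lookups, which are slow, so in practice the results would usually be cached per IP.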

robzilla

8:08 am on Feb 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



they grow humongous in size within a day or two

Is there no log rotation in place?

vegasrick

8:26 am on Feb 5, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



@robzilla, for whatever reason the log rotation would break at times without warning. We don't realize until the entire site starts bogging down as the servers begin to run out of room.

moTi

10:07 am on Feb 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



you are a webmaster with many years of experience and millions of uniques/month. you and your team depend with their livelihood on your website.

crawl-delay, log-rotation, logfile shaping. sorry, these are absolute basics in webmastering.

now you get screwed by some shady middlemen and desperately flee to cloudflare with no clue what's really going on there.

get some people on board who know what they're doing.

i am always astonished about the questions from some seniors around here. i can only shake my head in astonishment of your great success without knowing even the fundamental issues of webmastering. you can feel very lucky that you have come so far. i suspect it's the mercy of an early internet presence. hell, i'd kill for your traffic.

[edited by: moTi at 10:23 am (utc) on Feb 5, 2016]

vegasrick

10:20 am on Feb 5, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



@moti, our success is based on our A-level content. We are a recognized leader in our field in terms of breaking news stories and 24 hour coverage, which has little to do with webmaster tips. I know a lot of guys who are great webmasters and can't draw flies to their fancy looking websites. I focused on three things: content, SEO and monetization. We pump out as many as 40 stories per day, we have great search engine spots and millions of pages indexed, and we make a lot of money.

If it was something related to a certain code, or a script, or a hacking attempt, then I would know it like the back of my hand. But in 13 years I've never come across a "bot" issue and it's tough to figure out when four of AIS' competitors are telling me my traffic is just fine.

wilderness

11:27 am on Feb 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@robzilla, for whatever reason the log rotation would break at times without warning. We don't realize until the entire site starts bogging down as the servers begin to run out of room.


1) If the rotating logs would break and 'nobody noticed', then somebody was lax in their duties and/or not reviewing the logs.
2) Is space so limited on the server that log files will consume the maximum server space? Are the logs ZIPPED?
You need a larger capacity server or host!
3) Rotating logs are generally accomplished via cron jobs, which do occasionally fail (at least in some capacity), however they don't generally cease unless there is another server issue affecting all-crons.

BTW, I'm in agreement with the others, in that, without reviewing your logs there is not any evidence to support AIS' claims.
First thing you need to do is reactivate logs and establish a procedure for regular-review of same logs.
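
To make the zipping and "nobody noticed" points concrete, here is a minimal Python sketch of a rotate-and-compress step plus a free-space check that a cron job could alert on. It is an illustration only: `rotate_and_compress` and `free_fraction` are hypothetical helpers, the demo runs against a temporary directory, and a real web server keeps writing to the renamed file until it is told to reopen its logs, which is the part tools like logrotate normally handle.

```python
import gzip
import shutil
import tempfile
import time
from pathlib import Path


def rotate_and_compress(log_path):
    """Rename the live log, start an empty one, and gzip the old copy."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    rotated = log_path.with_name(f"{log_path.name}.{stamp}")
    log_path.rename(rotated)
    log_path.touch()  # fresh, empty live log
    archive = rotated.with_name(rotated.name + ".gz")
    with rotated.open("rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)
    rotated.unlink()  # keep only the compressed copy
    return archive


def free_fraction(path):
    """Fraction of the disk holding `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


# Demo against a throwaway directory with invented log content.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "access.log"
    log.write_text("line one\nline two\n")
    archive = rotate_and_compress(log)
    with gzip.open(archive, "rt") as fh:
        restored = fh.read()
    live_log_size = log.stat().st_size
    free = free_fraction(tmp)

print(archive.name, live_log_size, f"{free:.0%} free")
```

A scheduled job could call `free_fraction` on the log partition and send a warning when it drops below some threshold, so a broken rotation is caught before the servers run out of room.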

keyplyr

11:52 am on Feb 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



we have millions of pages due to our large forum
You may wish to consider purging some of the archaic forum threads, or at least compressing them for storage, if they are not being actively engaged by humans and only end up as spider food. They probably don't have much of an impact on ranking any longer, and it would free up some space and may increase speed.