So getting invalid traffic on my apache server

     
9:36 am on Mar 6, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 19, 2004
posts:887
votes: 11


My Apache web server is now getting invalid traffic. Any advice on how to block it?

Also, Cloudflare just got this new firewall feature "Create a rule to block or challenge a specific User Agent from accessing your site"

Is there any particular malicious user agent that I can block using this rule? Any recommendations please? Thanks!
10:08 am on Mar 6, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


We've had this discussion before. Same answer... check your server logs to determine who/what is the cause.

Blocking Methods [webmasterworld.com]

Search Engine Spider & User Agent ID Forum [webmasterworld.com]

Server Farm IP Ranges [webmasterworld.com]
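
As a starting point for the log check suggested above, here is a minimal sketch that tallies requests per IP and user agent. It assumes Apache's "combined" log format; the sample lines and the function name are made up for illustration, and lines that do not fit the assumed format are simply skipped.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

def tally_user_agents(lines):
    """Count requests per (IP, user agent) pair; a blank UA is logged as "-"."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not in the assumed format, so skip it
        counts[(match["ip"], match["ua"])] += 1
    return counts

# Hypothetical sample lines using documentation-only IP addresses.
sample = [
    '203.0.113.7 - - [06/Mar/2018:09:36:00 +0000] "GET / HTTP/1.1" 200 512 "-" "-"',
    '203.0.113.7 - - [06/Mar/2018:09:36:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "-"',
    '198.51.100.2 - - [06/Mar/2018:09:37:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

# Print the busiest (IP, UA) pairs first.
for (ip, ua), hits in tally_user_agents(sample).most_common():
    print(hits, ip, ua)
```

To run it against a real log, pass an open file object instead of the sample list. The busiest pairs with a blank or odd-looking UA are the ones worth looking up first.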
11:30 am on Mar 6, 2018 (gmt 0)

Junior Member

joined:Feb 22, 2018
posts:146
votes: 22


What do you call "invalid traffic"?
11:41 am on Mar 6, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


He means what Adsense is calling invalid... bots.
12:30 pm on Mar 6, 2018 (gmt 0)

Junior Member

joined:Feb 22, 2018
posts:146
votes: 22


He means what Adsense is calling invalid... bots.

Ok, I see.

I wonder if we (publishers) can really do something helpful in blocking invalid traffic. Yes, we can block "known" user agents, "known" IP ranges, things like that, but since they are "known", I would assume that Adsense already knows them and is already blocking them before an ad is served, and therefore clicked.

I am talking about invalid traffic regarding Adsense, not about scrapers and things like that, which are, of course, something we have to address ourselves.
12:43 pm on Mar 6, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


Adsense doesn't block any requests. Adsense doesn't control our server. Once the Adsense code is on our pages, it is our responsibility.

Adsense does however invalidate some clicks it determines to not be human... and even some that are human.

If Adsense sees this to be a significant number of invalid clicks, it then comes after publishers to fix the problem.
12:47 pm on Mar 6, 2018 (gmt 0)

Junior Member

joined:Feb 22, 2018
posts:146
votes: 22


Adsense doesn't block any requests

Are you sure of it? I can't believe that Adsense servers are answering all ad requests without a minimum of filtering; otherwise it would be easy to DDoS Adsense all the time.
12:56 pm on Mar 6, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


Adsense doesn't block requests on *our* server. The requests are made to fulfill the files linked to the webpage, including the Adsense code.

Whether the ad is served to what they consider to be invalid or not is represented by the clawbacks. So yes, ads are shown to invalid UAs. Maybe some are not shown. That would likely remain an unknown number.
7:37 pm on Mar 6, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15257
votes: 692


Maybe some are not shown.
Aren't most ads--whether through AdSense or other source--ultimately JavaScript-based? The vast majority of robots don't even request, let alone act on, scripts.
8:40 pm on Mar 6, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


@lucy24 - Clickbots, those creatures that intentionally cause false positives for Adsense publishers, are purposed to do just that... follow JS pretending they are human.
10:06 pm on Mar 6, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:8727
votes: 699


Manage the bots. Always. As for which UAs to manage, that can change minute by minute. A never-ending case of whack-a-mole. Or, you can whitelist and poke holes as needed. Both methods work.
3:45 am on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 19, 2004
posts:887
votes: 11


Should I block blank user agents?
3:52 am on Mar 7, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


Should I block blank user agents?
I always have. Be careful though. You may need to allow some IP ranges to use a blank UA.

example: Facebook
If you post promotional material at FB with images, FB will periodically use a blank referrer to update its cache of those files.
4:03 am on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 19, 2004
posts:887
votes: 11


Oh, what the heck. This fake traffic issue is a mess and I need to dive in. Secondly, there is a huge market for any new company to help with this farking mess!
5:54 am on Mar 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15257
votes: 692


FB will periodically use a blank referrer to update its cache of those files
Blank referer or blank UA?
6:03 am on Mar 7, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


Blank UA... but actually both.

Thanks for the heads-up :)
8:28 am on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 19, 2004
posts:887
votes: 11


Thanks guys, I’ll be sitting with a microscope and blocking bots as a first step.
12:33 pm on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 19, 2004
posts:887
votes: 11


So I have the option to block blank referrers (which I've decided I won't), but surely I can block blank user agents? Will FB have problems then? Kindly advise. Thanks!
12:44 pm on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 19, 2004
posts:887
votes: 11


Also I checked and found this link on Facebook's guidelines for webmasters:

[developers.facebook.com...]

They seem to have removed the blank UA issue, no?
7:49 pm on Mar 7, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


As I said above, one of FB's image caching agents will not use a UA nor a referrer. I post a lot at FB and see this empty string in my logs every single day (I'm looking at it now.)

So if you block blank UAs and do not allow the various IP ranges used by FB, your images will start to disappear from your FB posts.

There are several ways to accomplish this. Here's one way using htaccess that allows the several FB UAs, including the blank UA, only from FB ranges:

# Match a blank UA (empty, or a literal hyphen) or one of the FB UAs...
RewriteCond %{HTTP_USER_AGENT} ^-?$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^(facebook|Facebot|visionuti)
# ...when the request does NOT come from one of these FB ranges
RewriteCond %{REMOTE_ADDR} !^31\.13\.(6[4-9]|[789][0-9]|1[01][0-9]|12[0-7])\.
RewriteCond %{REMOTE_ADDR} !^66\.220\.1(4[4-9]|5[0-9])\.
RewriteCond %{REMOTE_ADDR} !^69\.63\.1(7[6-9]|8[0-9]|9[01])\.
RewriteCond %{REMOTE_ADDR} !^69\.171\.2(2[4-9]|[34][0-9]|5[0-5])\.
RewriteCond %{REMOTE_ADDR} !^173\.252\.(6[4-9]|[789][0-9]|1[01][0-9]|12[0-7])\.
# Then refuse the request with 403 Forbidden
RewriteRule ^ - [F]
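
For anyone who wants to test log entries against this rule offline, here is a rough Python sketch of the same decision. The CIDR blocks are worked out from the REMOTE_ADDR patterns above and have not been checked against Facebook's current allocations; the function name is made up for illustration.

```python
import ipaddress
import re

# CIDR equivalents of the REMOTE_ADDR patterns in the rule above
# (derived from the regexes, not from Facebook's published ranges).
FB_RANGES = [ipaddress.ip_network(cidr) for cidr in (
    "31.13.64.0/18",    # 31.13.64 to 31.13.127
    "66.220.144.0/20",  # 66.220.144 to 66.220.159
    "69.63.176.0/20",   # 69.63.176 to 69.63.191
    "69.171.224.0/19",  # 69.171.224 to 69.171.255
    "173.252.64.0/18",  # 173.252.64 to 173.252.127
)]

# Same user-agent tests as the two HTTP_USER_AGENT conditions.
BLANK_UA = re.compile(r"^-?$")
FB_UA = re.compile(r"^(facebook|Facebot|visionuti)")

def would_block(remote_addr, user_agent):
    """Return True when the rule above would answer 403 for this request."""
    ua_matches = bool(BLANK_UA.search(user_agent) or FB_UA.search(user_agent))
    addr = ipaddress.ip_address(remote_addr)
    in_fb_range = any(addr in net for net in FB_RANGES)
    # Blocked only if the UA is blank or FB-like AND the IP is outside the ranges.
    return ua_matches and not in_fb_range
```

So a blank UA from 31.13.64.1 passes, while a blank UA (or a spoofed facebookexternalhit) from anywhere else is refused. Ordinary browser UAs are not touched by this rule at all.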


There also may be other beneficial agents that use a blank referrer. This is why you need to do the research before you start blindly blocking access to your server.

Start by watching your server logs several times a day for a few months... to learn what agents access your server and who/what they are.

- - -
9:39 pm on Mar 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15257
votes: 692


Clickbots

Thanks, keyplyr, although I've got a nasty suspicion this is one of those things that get explained to me over and over again and it never sinks in.

one of FB's image caching agents will not use a UA nor a referrer.
:: detour to raw logs ::

Well, ###, I thought this only applied if you were personally active on FB so they were following-up on your own posts. Note that if you've got an IPv6 address, they will most likely come in from
2a03:2880::/29
(amusingly, out of this vast range of IPs-- /29 is a lot bigger in 6 than it was in 4-- they always pick the ones containing the string :face: ) But so far they've always had a UA from this range.
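
A quick way to check an address from the logs against that IPv6 range is sketched below. The function names are made up for illustration, and the sample address in the usage note is only an example of the pattern described, not a line from anyone's logs.

```python
import ipaddress

# The IPv6 range mentioned above.
FB_V6 = ipaddress.ip_network("2a03:2880::/29")

def is_fb_v6(addr_text):
    """True if addr_text is an IPv6 address inside 2a03:2880::/29."""
    addr = ipaddress.ip_address(addr_text)
    return addr.version == 6 and addr in FB_V6

def has_face_group(addr_text):
    """True if one of the address's 16-bit groups is exactly 'face'."""
    return "face" in ipaddress.ip_address(addr_text).exploded.split(":")

# An IPv6 /29 leaves 99 host bits; an IPv4 /29 leaves only 3.
print(FB_V6.num_addresses == 2 ** 99)
print(ipaddress.ip_network("192.0.2.0/29").num_addresses)
```

For example, is_fb_v6("2a03:2880:f10c:83:face:b00c:0:25de") and has_face_group on the same string both come back True.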

[edited by: lucy24 at 10:02 pm (utc) on Mar 7, 2018]

9:43 pm on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5496
votes: 3


born2, I didn't allow FB traffic for the longest while. FB is the most invasive server/software on the www (worse than any virus or malware).

Eighteen months ago I changed my tune due to widget topics.
The blank UAs are their problem, not mine.
There's far worse than blank UAs.
Wait until a FB user embeds a URL and exposes all the thread users' IPs to your raw logs.

Personally, I place FB in a similar category as all the Wiki pages. There is no real benefit (at least over time) to your site(s) because 99.99% of users merely view the one page. Most FB users have zero knowledge of primary www search engines (nor do they care); rather, they are trapped in FB (like the Wiki limitations for external links). Most don't even use the FB search options; rather, they simply ask another user (pure laziness of social interaction).
9:58 pm on Mar 7, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 890


FB can be a huge source of revenue generating traffic if nurtured and used properly, especially FB Groups.

[fix typo]

[edited by: keyplyr at 10:00 pm (utc) on Mar 7, 2018]

9:59 pm on Mar 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:8727
votes: 699


Personally I haven't allowed blank referers (sic) or UAs for the last 12 years. FB traffic still comes, from that link in SM. FB is not really a search engine and doesn't want to be, so I accommodate in that regard.
10:24 pm on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Sept 14, 2011
posts:1045
votes: 132


Personally I haven't allowed blank referers (sic) or UAs for the last 12 years


Jeez remind me not to type your url into the search bar
10:33 pm on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5496
votes: 3


tangor, my denial of blank UA's is at least fifteen years (possibly longer).
It's almost step I in the htaccess bible <BG>
10:36 pm on Mar 7, 2018 (gmt 0)

Junior Member

joined:Feb 22, 2018
posts:146
votes: 22


There are also malformed User Agents, which can be an indication of a bot.

Also, I don't know what to think about requests from IPs which do not have a hostname. I always find this suspect. I assume (may be wrong) that all legitimate ISPs should set up a hostname for each of their IPs.
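
If you want to see which IPs in your logs lack a hostname, a reverse DNS lookup is one way to check. This is a minimal sketch using Python's standard socket module; the function name and the injectable resolver argument are made up for illustration, and a missing hostname should be treated as a hint only, given the assumption stated above may be wrong.

```python
import socket

def reverse_hostname(ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for ip, or None when the lookup fails."""
    try:
        # gethostbyaddr returns (hostname, alias list, address list).
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None  # no PTR record, or the lookup could not be completed
    return hostname
```

Calling reverse_hostname("203.0.113.7") does a live DNS query, so run it sparingly over a long list of IPs. The resolver argument is there only so the function can be tried out without network access.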
10:58 pm on Mar 7, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5496
votes: 3


Travis, you referring to 'Private Customer'?
12:37 am on Mar 8, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15257
votes: 692


remind me not to type your url into the search bar
And whatever you do, don't bookmark a page, no matter how often you visit.

:: noting sadly that the conceptual merging of “search bar” and “address bar” proceeds apace ::
 
