How to block traffic from a certain search engine?

         

unclej

6:01 pm on Dec 22, 2016 (gmt 0)

10+ Year Member



I use Clicky for my analytics. Over the past few weeks, I have been getting a flood of spam in my analytics.
This is not your standard semalt referral spam.
They are making it look like they are coming from a search engine, and they even show what keyword they used.

So every minute or so, someone from some random country comes to the site; they could be in Iraq or Brazil or Saudi Arabia or India. They appear to be coming from some weird search engine that I have never heard of, and I can see what keyword they used. It is always a keyword related to my website.

It's hard for me to explain because nobody else seems to be having this problem so nobody understands what I am talking about.

OK, so in your analytics you normally see that a person came from Yahoo and searched for "buy mens shoes", right?

Every minute, in my analytics, someone comes from this weird search engine (it's something .info), and I can see the keyword they used.

How can I block them? I am sick of this.

not2easy

6:38 pm on Dec 22, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Easiest way would be to deny access based on referer - but you should realize that serving a 403/Access Denied response does not prevent their requesting "whatever" endlessly. It can only prevent them from being served the page or URL they requested. Over time, they may or may not go away.

lucy24

7:03 pm on Dec 22, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You said you're seeing them in analytics. Are they present in raw logs? If yes, you can use a variety of methods to block by referer. If they're only visible in analytics, you're seeing a popular recent form of referer spam; people get a lot of it in GA.

:: detour to clicky website ::

It looks as if the code lives on their server, not yours. So you have to take a look at your own raw logs first. If it turns out they're not really visiting your site, just lying to clicky about it, there's probably a setting where you can ignore all requests with suchandsuch pattern.

unclej

7:43 pm on Dec 22, 2016 (gmt 0)

10+ Year Member



Lucy,

I am not sure what you mean by raw logs. But these appear to be real people and not bots, because they click on things (I know that can be spoofed) and they come from different platforms like Android phones and Windows 7 (I know these can be spoofed too).
What convinced me that they are real people: a couple of times they submitted my contact form. My contact form has an anti-spam field, "enter any 5 digit number", and they entered a 5 digit number. It's unlikely that a bot would do that.

Is it really that popular? I asked on another forum and everybody assumed I was talking about the semalt kind of spam. This seems a little different to me: they appear to be searching for a term related to my website.

My theory is they want me to say "wow, I am getting so much traffic from this .info search engine, what is it? Let me visit their site." And then I get a virus?

I tried blocking by referrer, and although that seemed to reduce the spam, it didn't stop it; the spam continues.

lucy24

10:43 pm on Dec 22, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I am not sure what you mean by raw logs.

The ones that look like this:
5.248.196.220 - - [21/Dec/2016:03:35:01 -0800] "GET /fonts/ HTTP/1.1" 403 3490 "http://example.spam/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows XP)" 
They'll look a little different depending on your server type, but this is the Apache subforum so I assume that's what you have. It is worth getting familiar with your logs, even if you haven't the energy to study them on a regular basis. I can't advise you on where to find them, because that depends on the host; you'll probably only have access to the last few days, but you can download those and study them at leisure.

As noted elsewhere, blocking will not stop requests from happening; it will just keep the request from succeeding. If they are genuine humans, even this will cut back on attempts, since they'll never get far enough to interact with the site. You'll only see the initial blocked page request.

One approach, suitable for Apache 2.2 (also 2.4 with mod_compat*), uses mod_setenvif in conjunction with mod_authz_host:
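# Flag any request whose Referer header matches the spam domain
SetEnvIf Referer example\.spam bad_ref
# Refuse flagged requests with a 403
Deny from env=bad_ref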
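On Apache 2.4 without the compatibility module, the same idea can probably be expressed with the native Require directives. This is only a minimal sketch: it reuses the bad_ref variable and the placeholder domain example.spam from the lines above, and it assumes mod_setenvif and mod_authz_core are loaded and that your host lets you set these directives where you put them.

```apache
# Flag any request whose Referer header matches the spam domain
# (example.spam is a placeholder; substitute the real hostname)
SetEnvIf Referer example\.spam bad_ref

# Apache 2.4 syntax: allow everyone except requests carrying the flag
<RequireAll>
    Require all granted
    Require not env bad_ref
</RequireAll>
```

Pick one style or the other; mixing the old Deny/Allow directives with the new Require directives in the same configuration is discouraged.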
You can also do it with mod_rewrite.
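A minimal mod_rewrite sketch, assuming the same placeholder domain example.spam, that mod_rewrite is enabled, and that the rules go in .htaccess or a <Directory> section:

```apache
RewriteEngine On
# Match any request whose Referer header contains the spam hostname,
# ignoring case (example.spam is a placeholder)
RewriteCond %{HTTP_REFERER} example\.spam [NC]
# Leave the URL untouched ("-") and answer 403 Forbidden
RewriteRule ^ - [F]
```

If you already have other RewriteRules, this one would normally go ahead of them so the blocked request never reaches the later rules.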


* The mod's actual name is mod_access_compat, but everyone knows what I mean.