
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
sub-semalt
wilderness




msg:4694299
 2:25 am on Aug 10, 2014 (gmt 0)

Does anybody have a clue whether this is correct syntax?

#any two numbers
RewriteCond %{HTTP_REFERER} ^http://[0-9]{2}\.semalt\.com/

 

iamzippy




msg:4694346
 11:03 am on Aug 10, 2014 (gmt 0)

It works in RegexBuddy.
So does:

RewriteCond %{HTTP_REFERER} ^http://\d\d\.semalt\.com/

wilderness




msg:4694350
 1:01 pm on Aug 10, 2014 (gmt 0)

Many thanks

Pfui




msg:4694358
 1:53 pm on Aug 10, 2014 (gmt 0)

There are assorted subdomains (e.g.: http://semalt.semalt.com/) thus --

RewriteCond %{HTTP_REFERER} semalt
RewriteRule .* - [F]

-- works for me. (Ditto for fellow pest kambasoft.)

ronin




msg:4694362
 2:24 pm on Aug 10, 2014 (gmt 0)

I'm still learning (and improving) my regex skills, but I am using this:

RewriteCond %{HTTP_REFERER} ^https?://([a-z0-9-]+\.)?semalt\.com [NC]
RewriteRule .* - [F]

not2easy




msg:4694375
 3:50 pm on Aug 10, 2014 (gmt 0)

Couldn't these be combined like we do for UAs?

RewriteCond %{HTTP_REFERER} (kambasoft|semalt|whatever) [NC]
RewriteRule .* - [F]

wilderness




msg:4694380
 4:27 pm on Aug 10, 2014 (gmt 0)

certainly.

dupres01




msg:4694397
 5:45 pm on Aug 10, 2014 (gmt 0)

which is the better form to use (and, to help with my education, why)?
this one:
RewriteCond %{HTTP_REFERER} (kambasoft|semalt|whatever) [NC]
RewriteRule .* - [F]

or this one:
RewriteCond %{HTTP_REFERER} kambasoft [NC,OR]
RewriteCond %{HTTP_REFERER} semalt [NC,OR]
RewriteCond %{HTTP_REFERER} whatever [NC]
RewriteRule .* - [F]

wilderness




msg:4694398
 5:50 pm on Aug 10, 2014 (gmt 0)

The combined line puts less strain on the server and is slightly faster.

Both do work, however.

I have one for "crawler" as well; for simplicity's sake, and to possibly stop another stray bot, you could use "crawl" instead.
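
A minimal sketch of what such a "crawl" rule might look like. It assumes the rule tests the User-Agent header (the post does not say which header it checks) and that RewriteEngine is already switched on elsewhere in the htaccess:

```apache
# Assumption: "crawl" is matched against the User-Agent, not the referer.
# The pattern is unanchored and [NC] makes it case-insensitive, so it
# catches "crawl", "Crawler", "webcrawler" and similar strings.
RewriteCond %{HTTP_USER_AGENT} crawl [NC]
RewriteRule .* - [F]
```

Because the shorter string is a substring of the longer one, "crawl" blocks everything "crawler" would, plus any agent that only says "crawl" or "crawling".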

lucy24




msg:4694414
 7:05 pm on Aug 10, 2014 (gmt 0)

The form
#any two numbers
RewriteCond %{HTTP_REFERER} ^http://[0-9]{2}\.semalt\.com/

is syntactically correct, but I suspect it's easier on the server if you simply say
^http://[0-9][0-9]\.semalt\.com
That's assuming it will always be exactly two. Otherwise of course you'd go to
[0-9]+
Or-- my preference-- \d for a savings of three bytes ;)
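
Written out as a full condition, the "one or more digits" variant might look like the sketch below. The RewriteRule line is borrowed from the earlier posts in this thread; it assumes RewriteEngine is already on:

```apache
# One or more digits in front of .semalt.com.
# \d+ and [0-9]+ behave the same here; pick whichever reads better.
RewriteCond %{HTTP_REFERER} ^http://\d+\.semalt\.com
RewriteRule .* - [F]
```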

fwiw, mine simply says

SetEnvIf Referer semalt keep_out

It's in mod_setenvif because this rule is in my shared htaccess used by all sites. If it were for a single site it would be expressed as a RewriteCond along with assorted other referer-based lockouts.
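
The SetEnvIf line only tags the request; something else has to act on the variable. A sketch of one way to do that, assuming Apache 2.2-style access directives (the post does not show this part, so the deny lines are an assumption):

```apache
# Tag any request whose Referer header contains "semalt".
SetEnvIf Referer semalt keep_out

# Apache 2.2-style access control: allow everyone except tagged requests.
Order Allow,Deny
Allow from all
Deny from env=keep_out

# Apache 2.4 equivalent, to use instead of the three lines above:
# <RequireAll>
#     Require all granted
#     Require not env keep_out
# </RequireAll>
```

SetEnvIf is case-sensitive; SetEnvIfNoCase is the variant to reach for if the referer might arrive in mixed case.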

My impression is that semalt works 100% via infected human browsers, because they always ask for favicon and stylesheet. Robots normally don't. Did anyone ever figure out what they want?


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved