Blocking referer in htaccess

Will this htaccess coding work?

         

grandma genie

1:02 am on Feb 14, 2011 (gmt 0)

10+ Year Member



Hi Jim et al,

I get lots of hits in my server logs from forums. I don't think these folks are actually visiting my site. I think someone on the forum tried to hotlink an image, and whenever anyone accesses that page of the forum, a 403 appears in my logs for the hotlinked image. These hits seem to be blossoming.

One of the referers of this type of hit is dlisted.blogspot.com. Most of the hits also are from overseas IPs, such as Poland, Iran, Saudi Arabia, Russia, etc. I was blocking the IPs in htaccess, but I decided to try blocking the referer in htaccess like this:

# Return 403-Forbidden response for bad referers
RewriteCond %{HTTP_REFERER} dlisted\.blogspot\.com [NC]
RewriteRule ^ - [F]

To my surprise, all hits from that referer have stopped. I thought I would see more 403s. If I block all offending referers like this, would it stop all of them from showing up in my logs? Here are some examples of the referers I see:

cienporcientoperros.blogspot.com
forumoyorkach.proste.pl
www.guashan.com
globalmalaysians.com

Also, do I need to include the http:// or the www in the RewriteCond?

I appreciate your comments.

Jeannie

wilderness

2:04 am on Feb 14, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"To my surprise, all hits from that referer have stopped. I thought I would see more 403s. If I block all offending referers like this, would it stop all of them from showing up in my logs?"


Denying access does not prevent requests from appearing in the raw logs.
However, if you run your own server, you can configure the raw log output to exclude 403s.
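
For anyone who does control the server configuration, conditional logging might look something like the sketch below. It assumes Apache 2.4 or later (the expr= form is not available in 2.2), a log path of logs/access_log, and a "combined" log format already defined by a LogFormat line. CustomLog is not allowed in .htaccess, so this would go in the main or virtual-host configuration.

```apache
# Server or virtual-host config only (not .htaccess).
# Assumed path and format nickname; adjust to match the existing setup.
# Log every request except those answered with a 403 status.
CustomLog logs/access_log combined "expr=%{REQUEST_STATUS} != '403'"
```

The 403 responses are still sent to the client; they are simply left out of this log file.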

The hits from your blocked referer may have disappeared simply because the referring website removed its outbound links to your site.
Archive.org does something similar: it adds a notation of requested compliance when it encounters either a robots.txt exclusion or 403s.

wilderness

2:09 am on Feb 14, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Also, do I need to include the http:// or the www in the RewriteCond?"


No.
And in most instances it's overkill.
You may use a single word, a string of words, or several alternatives (for different websites) enclosed in parentheses and separated by the pipe character.

(brown|blue|Red) [NC]

You might also wish to include the no-case flag, [NC], as shown above.

It all depends on how accurate you want to be and how much you care about not blocking innocent visitors.

Please keep in mind that referer-based denials are easily defeated and less than 100% accurate.
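
As a rough sketch, the alternation approach applied to the hostnames listed earlier in the thread could look like this. The grouping and escaping are one possible arrangement rather than a tested rule set, and the sketch assumes RewriteEngine is already on elsewhere in the file.

```apache
# Return 403-Forbidden when the Referer contains any of these hostnames.
# No http:// or www. prefix is needed because the pattern is unanchored.
RewriteCond %{HTTP_REFERER} (dlisted\.blogspot\.com|cienporcientoperros\.blogspot\.com|forumoyorkach\.proste\.pl|guashan\.com|globalmalaysians\.com) [NC]
RewriteRule ^ - [F]
```

Because the pattern is unanchored, guashan\.com also matches www.guashan.com, but it would equally match any referer URL that merely contains that text somewhere else. That is the accuracy trade-off.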

grandma genie

6:28 am on Feb 14, 2011 (gmt 0)

10+ Year Member



OK, thanks Wilderness. I don't think most of these hits are actual visitors, so I don't think they would even notice if I blocked the forum that has the offending image. But if it encourages the forum to remove the link, that would be great. I appreciate the suggestions for fine tuning the htaccess coding.

jdMorgan

10:05 pm on Feb 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



These might just be 'bot hits using those forums' URLs as referrers. When the 'bot sees your 403, it stops visiting. In this respect, you're lucky, since most 'bots are brain-dead and just pound their heads on the door even if it's closed and bolted shut...

However, it was definitely not a direct cause-and-effect between your 403 and the disappearance from your logs of any more requests. Either a person or a 'bot decided to stop what they were doing -- hotlinking, visiting, etc.

Determining a hot-link is easy: Requests for *page* URLs are not generally hotlinked requests. Requests for *objects included in your pages* such as images, .css, and .js files referred by any domain other than your own are what we call hotlinked requests.
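
A common way to express that distinction in mod_rewrite is sketched below. It assumes the site's own hostname is example.com (a placeholder) and that the listed file extensions cover the objects to protect. Blank referers are let through, since some browsers and proxies omit the header.

```apache
RewriteEngine On
# Skip the rule when the Referer header is blank
RewriteCond %{HTTP_REFERER} !^$
# Skip the rule when the request is referred by the site's own pages
# (example.com is a placeholder for the real hostname)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Any other request for an included object gets 403-Forbidden
RewriteRule \.(gif|jpe?g|png|css|js)$ - [F,NC]
```

Page URLs are not matched by the RewriteRule pattern, so visitors following an ordinary link from another site are unaffected.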

Jim

grandma genie

4:47 am on Feb 18, 2011 (gmt 0)

10+ Year Member



Hi Jim,
I think most of the referer hits come from my images being used on various forums. Because I ban hotlinking in htaccess, the image does not show, but whenever someone visits that forum page, I get a 403 in my logs. I don't think there is a way to stop it without contacting the forum and asking them to remove the link. But many of the forums are in foreign languages, so I can't communicate with them, or the forum is adult content, and I really don't like visiting those. I think the drop in visits from dlisted.blogspot.com was just a coincidence. Too bad. My error logs really are not that big, so other than these hits being a nuisance, there isn't much I can do to stop them. I was just hoping to be able to clean up the error logs a bit. Most of the errors are 403s and, as I said, come from hotlinked images in forums.

jdMorgan

5:16 am on Feb 18, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ah, OK... Well, my 403 log is *enormous*.

And it just makes me smile.

Much better than having to generate a big pile of DMCA complaints.... :)

Jim

micklearn

6:30 am on Feb 18, 2011 (gmt 0)

10+ Year Member



Hey, Jim,

Just a quick question...someone advised me to use:

RewriteRule .* - [F]

rather than:

RewriteRule ^ - [F]

(per the OP example)

What does the difference in the code mean exactly? I want to make sure that I *am* blocking certain companies/sites completely.

Thanks,
Mick

jdMorgan

5:14 pm on Feb 18, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ask "someone" why he recommended the longer pattern -- I don't know.

The ".*" pattern means "match zero or more of any character," so it matches any requested URL-path, including an empty one.
The "^" pattern is the start-of-string anchor. Every URL-path has a start, so it matches any URL-path that begins with anything or nothing.

So, the two patterns have identical practical effects. Mine is just shorter/faster to parse.

You could use "^", ".*", ".?", ".{0,}" or several others, all with the exact same effect in this particular application.

Jim