Blocking *.blogspot.* hotlinking

how to block hotlinking from blogspot (which is hosted by google)

     
3:51 am on Feb 11, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


I am trying to defend my sites from others who hotlink my images. I don't mind the odd forum hotlinking, but this gets out of hand with repeated hotlinking from the same site and web page. Usually I am successful with an http_referer rewrite condition such as:

RewriteCond %{HTTP_REFERER} ^http://.*daro*\.com/ [OR]...
RewriteRule ^.* - [F,L]

This usually works for me. That said I have 3 *.blogspot.* sites that hotlink my images that do not get banned by my usual http_referer condition. Blogspot is hosted by google, and they must have some special magic that tricks me.

I have tried this as well, which also does not work:
RewriteCond %{HTTP_REFERER} ^http://(.+\.)?blogspot [OR]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ - [F,L]

What is odd is that when I delete the images they request from my directories, I see in my log that they continue to receive successful 200s and download the image, even when the image no longer exists. I contacted my host provider who could not give me a proper explanation. I then created a single pixel image and renamed it to the image they requested, which works and only costs me 35 bytes/request.

What I'd rather have is that these *.blogspot.* sites get a 403/500 when they hotlink my images, even though my single pixel image is less costly. These blogspot sites show up in my Google Analytics as well as my shortstat log, which I would rather not happen. Can anyone explain to me how they can bypass the http_referer condition? How can I ban these blogspot hotlink bandits? These two sites are very popular and really hit me hard each day with numerous (50) downloads of the same images, thus greatly affecting my bandwidth resources.

I have contacted Google about hotlinking, but as these are not my copyrighted images I cannot submit a DMCA request. Google seems to tolerate hotlinking if the image is not your copyrighted image.

Any help would be greatly appreciated. Thanks. I am new here, so hope I did not break any forum rules.
6:43 am on Feb 11, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:8603
votes: 381



Hi TorontoBoy and welcome to WebmasterWorld [webmasterworld.com]

Try this. It is much simpler, and I removed a couple of things that could have interfered:

RewriteCond %{HTTP_REFERER} blogspot
RewriteRule (jpe?g|gif|bmp|png)$ - [F]

[edited by: keyplyr at 6:45 am (utc) on Feb 11, 2017]

6:45 am on Feb 11, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13737
votes: 458


RewriteCond %{HTTP_REFERER} ^http://(.+\.)?blogspot [OR]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ - [F,L]

Did you skip a line here? You can't put [OR] at the end of your last condition, or the rule will always execute. (Guess how I know this.) And, by your post, this does not seem to be the case.

Why don't you just say "blogspot" in the REFERER line, without anchors? Then you don't have to worry about protocol and subdomains.

<tangent>
There's a lot of extraneous guff in the quoted lines:
--the (.+\.)? makes the rule less efficient; you probably meant ([\w-]+\.)? but in any case isn't there always a subdomain, so why make it optional?
--the initial .* in the RewriteRule (do you have a lot of files named ".jpg" alone?) is similarly wasteful, since the sole effect of the uncaptured .* is to add a few nanoseconds to rule execution
--the [L] flag is superfluous because [F] carries an implied [L]
</end tangent>
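
Applied together, the simplifications suggested above might look like the following. This is a minimal sketch rather than a tested drop-in: it assumes mod_rewrite is enabled and that the rule sits in the .htaccess that actually governs the image directory. The [NC] flag and the escaped dot before the extension are additions not mentioned in the posts above.

```apache
RewriteEngine On

# Unanchored match: no need to spell out the protocol, the subdomain,
# or the country-specific ending (blogspot.de, blogspot.co.id, ...).
RewriteCond %{HTTP_REFERER} blogspot [NC]

# No leading .* is needed, and [F] already implies [L].
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]
```

If more referers are added later, every condition except the last one in the chain takes [OR].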

I see in my log that they continue to receive successful 200s and download the image, even when the image no longer exists

This can only happen if the images are being rewritten to something else before the request reaches your [F] rule. You don't happen to have an additional anti-hotlinking routine elsewhere do you? I can think of at least two possible scenarios.
2:30 pm on Feb 11, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10858
votes: 67


is there an internal rewrite somewhere that affects those image requests?
or perhaps some type of proxy or caching was in effect when you were testing the anti-hotlinking?
2:54 pm on Feb 11, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Thanks keyplyr and lucy for the corrections. I have made these changes and will wait until tomorrow to see if this works. Banning with the htaccess is quite challenging. These small changes can make such a difference, even with Regex experience. I will need a couple of days to digest your comments.

As for my hotlinked images that do not exist (because I deleted the image), I have no other anti-hotlinking routines elsewhere. It is puzzling, because this happened on two independent host providers (I just moved to a new host in Dec 2016). I had hoped that the issue would go away when I migrated, but the issue migrated with me. I had thought that there was some server cache somewhere that cached commonly fetched images, but my first host provider said there was no such cache. The issue continued the very day I migrated to the second host provider.

Is there a place on this forum to discuss bot killing, bot strategies/patterns, and possible solutions? All my bot killing eventually gets implemented in my htaccess, which drew me here.
5:05 pm on Feb 11, 2017 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3275
votes: 160


Hello TorontoBoy and Welcome to WebmasterWorld [webmasterworld.com]

We do have a forum dedicated to sharing our issues with various UAs/bots/spiders here: [webmasterworld.com...] and it is fairly active. You may find tactics mentioned in the Charter and/or Library there which are found under the Forum Options dropdown in each forum (just above the 1st post usually).
5:49 pm on Feb 11, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Thanks not2easy. This library is incredible. I have a lot of reading to do. I have been bot killing for a while, but they are so numerous and creative (in an evil way).

[webmasterworld.com...]
3:33 pm on Feb 12, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Yesterday's htaccess changes included:

RewriteCond %{HTTP_REFERER} blogspot [OR]...
RewriteRule (jpe?g|gif|bmp|png)$ - [F]

-deleted my hotlinked image *beetle.jpg, 37.4k

My server returned:
183.171.87.204 [12/Feb/2017:01:30:19 GET /*stuff/*beetle.jpg HTTP/1.1 200 435 [*.blogspot.my...] Mozilla/5.0 (Linux; Android 4.4.2; ASUS_T00K Build/KVT49L) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36
183.171.87.204 [12/Feb/2017:01:30:40 GET /*stuff/*beetle.jpg HTTP/1.1 200 435 [*.blogspot.my...] Mozilla/5.0 (Linux; Android 4.4.2; ASUS_T00K Build/KVT49L) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36

So my server returned a 200 code and 435 bytes to the spammer. My image is 37.4k in size, so I was mistaken about my server returning an image I had deleted. What is my server returning for 435 bytes?

Why would my server return a 200 code and 435 bytes when I did an http_referer block? Why did it not return a 403 or 500? There are so many IPs from Malaysia and Indonesia cell/sat phones that are referred. I have been able to track and ban the IPs of a few recurring ones, and they receive 500s, but this is very laborious.

I realize that returning my single pixel image for 35 bytes is better than no image returning 435 bytes, and much better than returning my original image for 37.4k.

Using a Tor browser, after I googled it to try to see if it is safe, I did go to the referer's page and they do get a broken image. I verified that the hotlink to my site is in their code.

Any help would be appreciated. Thanks.
3:44 pm on Feb 12, 2017 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3275
votes: 160


435 bytes is the size of the 403 error? If you do not have a custom 403 error page specified in your htaccess file and the host server default is plain text - or if you do, but it is plain text, then that makes sense.

As discussed above, it is best not to leave that [OR] in your condition if that is the only condition:
Did you skip a line here? You can't put [OR] at the end of your last condition, or the rule will always execute. (Guess how I know this.) And, by your post, this does not seem to be the case.
4:17 pm on Feb 12, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Yes, the [OR] is there because I have other http_referer conditions after this. Thanks.

I do have a custom 403.php, but it is 1.3k and not 435b. Could the error be some other text error message? If an error is returned, why would my server log a 200 code?

ErrorDocument 403 /403.php
8:01 pm on Feb 12, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13737
votes: 458


If an error is returned, why would my server log a 200 code?

Previously I was thinking there are two scenarios that could result in a 200 when you wanted a 403. In fact there's a third--but it doesn't fit your circumstances. So let's see what I've overlooked.

#1 before the RewriteRule yielding a 403 [F], there is an earlier RewriteRule that rewrites certain requests to some different image file -- for example, rewriting unwanted-image-request.jpg to "nohotlinks.png"

#2 similar to the above, only this time rewriting to a php file that evaluates the request and issues a 403, meaning that the user gets a 403 while the server records a 200

#3 in a deeper directory, such as the one containing the images, there is an additional .htaccess with additional RewriteRules (with or without an Inherit option, leading to #3a and #3b which we needn't bother with yet)
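
Scenario #1 can be sketched as follows. This is an illustration only, assuming a .htaccess in the document root and a placeholder image named nohotlinks.png (the name used in the example above); it is not a reconstruction of the poster's actual file. Scenario #2 would have the same shape, with a script as the rewrite target instead of an image.

```apache
RewriteEngine On

# Let the placeholder through untouched; otherwise it would itself
# match the image rules below when the rewritten request is re-examined.
RewriteRule ^nohotlinks\.png$ - [L]

# Hypothetical earlier rule: swap hotlinked images for the placeholder.
RewriteCond %{HTTP_REFERER} blogspot [NC]
RewriteRule \.(jpe?g|gif|bmp|png)$ /nohotlinks.png [L]

# Intended block. Requests caught by the rule above never get here,
# because they are answered with the placeholder and logged as 200.
RewriteCond %{HTTP_REFERER} blogspot [NC]
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]
```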

ErrorDocument 403 /403.php

It is easy to verify that this rule is working as intended. Just make a request that you know isn't permitted, such as requesting a directory that has no index.html page. Do you get your custom 403 page? (This reminds me: make sure that one of your first RewriteRules says something like
RewriteRule ^(403\.php|nohotlinks\.png) - [L]
listing all error documents, along with anything that functions equivalently. You need this to prevent infinite loops.)
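
A sketch of why the exemption matters, assuming the .htaccess and 403.php both sit in the document root. The site-wide block shown here is hypothetical and broader than the image-only rule discussed so far:

```apache
ErrorDocument 403 /403.php

RewriteEngine On

# First rule: pass the error document through unchanged.
RewriteRule ^403\.php$ - [L]

# A block that applies to every URL. Without the exemption above,
# the internal request for /403.php would carry the same Referer
# and be refused as well, so the custom page could not be served.
RewriteCond %{HTTP_REFERER} blogspot [NC]
RewriteRule ^ - [F]
```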
10:49 pm on Feb 12, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


My site is set up as a subdomain, as in /domain/subdomain/, and not an addon domain. My htaccess is in public_html, as is the 403.php. The directory of the subdomain is public_html/subdomain. I have no other 404/403.phps in the /subdomain/. Within this file structure I found an additional htaccess within the subdomain directory, a standard one for Wordpress, installed with all new WP sites, and I have not futzed with it. Finding this small htaccess was a complete surprise to me.

I have no other rewrite redirects to a different image file, so #1 is out. #2 is possible, as I have a 403.php in public_html. And now #3 is possible. I have included the htaccess, standard for all WP installs. I don't understand the "<IfModule mod_rewrite.c>".

I know, and tested, that there is no htaccess inheritance for addon domains for my host provider. I asked, they said "no", I did not believe them and tested, they were right. How does htaccess inheritance work for a subdomain? I do know that my htaccess protects my subdomain from banned IPs, and most UAs and referrers (sometimes does not work).

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /xxx/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /wp/index.php [L]
</IfModule>

# END WordPress

I did not anticipate this level of complexity. Thank you for your advice.
2:40 pm on Feb 13, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Today my server is a bit schizophrenic, logging both 403s and 200s for the same referer, *.blogspot.*, and same request. The 403s return 637b and the 200s return 435b. This is true for *.blogspot.co.id and *.blogspot.de. I only get 200s for *.blogspot.sg, but there are only 2 requests. In chronological time order it randomly flicks between 403s and 200s. I cannot seem to find any pattern. At least for the IPs that repeat, they get a consistent return code, either all 403s or all 200s.

Is there some machine learning plugin that Apache has deployed and not told anyone? :)

107.167.112.188 [13/Feb/2017:01:08:26 GET /stuff/*beetle.jpg HTTP/1.1 200 435 http://*.blogspot.de/2013/08/updated-kaget-banget-lipstick-baruku.html?m=1 Opera/9.80 (Android; Opera Mini/21.0.2254/37.9389; U; id) Presto/2.12.423 Version/12.16
168.235.201.61 [13/Feb/2017:01:45:04 GET /stuff/*beetle.jpg HTTP/1.1 403 637 http://*.blogspot.de/2013/08/updated-kaget-banget-lipstick-baruku.html Mozilla/5.0 (Linux; U; Android 5.0.2; en-US; Redmi Note 3 Build/LRX22G) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 UCBrowser/10.7.2.645 U3/0.8.0 Mobile Safari/534.30
124.153.33.7 [12/Feb/2017:23:05:46 GET /stuff/*beetle.jpg HTTP/1.1 403 635 http://*.blogspot.co.id/2013/08/updated-kaget-banget-lipstick-baruku.html Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
118.96.234.231 [13/Feb/2017:02:02:46 GET /stuff/*beetle.jpg HTTP/1.1 200 435 http://*.blogspot.co.id/2013/08/updated-kaget-banget-lipstick-baruku.html Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36

At least I am getting some 403s, which is the goal. This is the first time in over a year.


4:18 pm on Feb 13, 2017 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3275
votes: 160


UAs and referrers (sometimes does not work)

UAs and referrers are often not what/who they say they are so they aren't going to always be effective.
5:13 pm on Feb 13, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


When I say UAs and referrers often do not work, I mean that when I see them in my log and set up a rewrite condition to ban them, the condition is often ignored, or the referrer can somehow subvert the condition.

UAs and referrers are often not what/who they say they are so they aren't going to always be effective

I realize that UAs and referrers can be made up and changed to whatever.
6:27 pm on Feb 13, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13737
votes: 458


My site is set up as

Uhmm... Some of the information in this post would have been awfully useful to have had all along, so we could give more accurate answers.

Some basics:

htaccess is based strictly on physical directory structure. Aliases don't apply. (For example: Years ago I had one directory on my personal site aliased to my son's userspace--different directory, possibly even different server--so he could edit my game files. A side effect was that my own htaccess had no effect on requests for material in this directory. Took me ages to figure out why unwanted robots were still getting in, while they were blocked everywhere else.)

Logs, on the other hand, typically go strictly by hostname. If you've spent time looking at stats (such as analog stats or piwik), all your requests for /stats/ or similar will show up in logs even when the data lives in an entirely different physical directory.

With rare exceptions, all htaccess directives are inherited: once you've said "Options -Indexes" or "Allow from all" or what-have-you, these rules stay in effect no matter how many additional htaccess files the request encounters, so long as the rule isn't superseded by a different rule (for example, you might permit auto-indexing in specific directories while disallowing it everywhere else).

The big, big exception is that mod_rewrite is not inherited. If more than one htaccess along the same path contains RewriteRules, only the last (deepest) one will be used; the others will be discarded as if they had never existed. Even if you say RewriteOptions inherit (assuming Apache 2.2), or some of the new options available in Apache 2.4, inheritance doesn't behave the way it does in other mods. In no case will more than one RewriteRule, from different htaccess files, be applied to the same request. (Remote possibility of exceptions if a rule doesn't have the [L] flag, whether implied or explicit, but this is wildly unlikely.)
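
For reference, the RewriteOptions directive mentioned above would be placed in the deeper .htaccess. This is a hedged sketch only: the file path is an assumption based on the directory layout described earlier in the thread, and how inheritance behaves on a particular host is worth testing rather than assuming.

```apache
# public_html/subdomain/.htaccess (assumed location)
RewriteEngine On

# Ask mod_rewrite to take over the rules from the parent directory's
# .htaccess as well. Per the Apache documentation, inherited rules are
# applied after the rules written in this file.
RewriteOptions Inherit

# Apache 2.4 alternative: apply the parent's rules before this file's rules.
# RewriteOptions InheritBefore
```

One caveat: inherited patterns are evaluated relative to the deeper directory, so parent rules that depend on a path prefix may not match as they did in the parent.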

This fact becomes crucial in WordPress (and several other popular CMS) since the whole thing is built on mod_rewrite. Aside: The <IfModule> envelope is normally pointless--either you've got the module or you haven't, and if you haven't got it, WordPress won't work--but don't edit anything inside the named WordPress section. Leave it strictly alone, beginning and ending with the # WP comment lines.

The UA and referer you see in your raw logs are the same UA and referer that mod_rewrite (and the rest of Apache) sees. So for now that's a non-issue.

Edit: Dammit, WebmasterWorld, I would be absolutely thrilled to fix my Style Code tags ... if only you would give me some hint where "position 9" is. Grr.
7:22 pm on Feb 13, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Ok, Thanks Lucy. I guess I just did not know the directory structure info was important. I apologize.

Threw:

RewriteCond %{HTTP_REFERER} blogspot [OR]
RewriteCond %{HTTP_REFERER} ^http://(.+\.)?treponregos [NC]
RewriteRule (jpe?g|gif|bmp|png)$ - [F]

into my WP directory's htaccess. This info would not have been inherited from PUBLIC_HTML to the /subdomain/. I'll see what happens.

Note: "^http://(.+\.)?" I will fix this unnecessary fluff.
10:21 pm on Feb 13, 2017 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3275
votes: 160


You know, if you have several distinct referrers to block, you could use a simpler format:
SetEnvIf Referer (blogspot|treponregos) trash

Then in your deny list you can add:
deny from env=trash
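
Put together, the SetEnvIf approach might look like this. It is a sketch under assumptions: the variable name trash comes from the post above, SetEnvIfNoCase is used so that capitalisation does not matter, and the Apache 2.4 form is shown only as a commented-out alternative for hosts that no longer accept the older Order/Deny syntax.

```apache
# Flag any request whose Referer mentions either name.
# The pattern is unanchored because the Referer is a full URL.
SetEnvIfNoCase Referer (blogspot|treponregos) trash

# Apache 2.2 style access control:
Order Allow,Deny
Allow from all
Deny from env=trash

# Apache 2.4 style equivalent:
# <RequireAll>
#     Require all granted
#     Require not env trash
# </RequireAll>
```

As written, this refuses every request from those referers, not only image requests.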


PS If you add lines to the htaccess file where your WP section is, make sure the additional lines are before that WP part, not after.
11:00 pm on Feb 13, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


I am actively researching the SetEnvIf statements now. They seem to function the same, Lucy seems to like them, and they seem to be inherited by subdirectories. This would mean I would not have to duplicate rewrite rules for subdirectories. I maintain 5 sites on my host account, and wanted to maintain only one htaccess. This has not turned out as I had planned.

PS If you add lines to the htaccess file where your WP section is, make sure the additional lines are before that WP part, not after.

It is good that you mention this, as I already put additional rewrite rules after the WP part. Why is it better to add additional rewrite rules before the WP part?

Thanks for all these tips. I feel I am running less blind with my htaccess.
11:08 pm on Feb 13, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:8603
votes: 381


RE: SetEnvIf

I'm a big believer in loading as few server modules as necessary. If you have mod_rewrite already loaded, I suggest using that to block.
7:01 am on Feb 14, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13737
votes: 458


If you're on shared hosting, all modules are going to get loaded no matter what your htaccess actually uses. Too often, mod_rewrite is like shooting flies with an elephant rifle. There's a reason the apache docs are always urging you to use some other mod (even if it's completely unrealistic for most people in most situations).

But the huge advantage of using something other than mod_rewrite* is that you can put it in the highest possible directory (such as your userspace or "primary" domain, depending on host's preference) and it will all trickle down to everywhere else. In my case, I've got a shared htaccess in my userspace that mainly deals in mod_setenvif and mod_authzwhateveritis, handling access control for all domains. Then, each individual domain has a site-specific htaccess that mainly does mod_rewrite, plus some things like ErrorDocument directives that may not be the same everywhere, or Index options that don't work properly outside the site context.

* I have to come up with an acronym for this. MOTR?
2:03 pm on Feb 14, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Added this to my WP htaccess, a subdir of my public_html htaccess
RewriteCond %{HTTP_REFERER} blogspot [OR]
RewriteCond %{HTTP_REFERER} treponregos [NC]
RewriteRule (jpe?g|gif|bmp|png)$ - [F]

Again, the server logged half 403s (638 bytes) and half 200s (435 bytes), and I don't know why. Total: 21 server log entries.
120.188.76.233 [13/Feb/2017:13:21:15 GET /stuff/*beetle.jpg HTTP/1.1 403 637 [*.blogspot.de...] Mozilla/5.0 (Linux; U; Android 5.1.1; en-US; F1f Build/LMY47V) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 UCBrowser/11.2.0.915 U3/0.8.0 Mobile Safari/534.30
180.248.22.157 [14/Feb/2017:01:21:07 GET /stuff/*beetle.jpg HTTP/1.1 200 435 [*.blogspot.de...] Mozilla/5.0 (Linux; U; Android 4.4.2; en-US; E1C Pro Build/KOT49H) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 UCBrowser/11.0.0.828 U3/0.8.0 Mobile Safari/534.30
103.47.133.0 [13/Feb/2017:12:22:35 GET /stuff/*beetle.jpg HTTP/1.1 403 635 [*.blogspot.co.id...] Mozilla/5.0 (Linux; Android 6.0.1; SAMSUNG SM-A800F Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/4.0 Chrome/44.0.2403.133 Mobile Safari/537.36
64.233.173.10 [13/Feb/2017:21:50:13 GET /stuff/*beetle.jpg HTTP/1.1 200 435 [*.blogspot.co.id...] Mozilla/5.0 (Linux; Android 5.1.1; A37f Build/LMY47V) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.91 Mobile Safari/537.36

I think it is time to try the SetEnvIf in my main htaccess, which should be inherited by my subdom WP htaccess:
SetEnvIf Referer "blogspot" keep_out
Order Allow,Deny
Allow from all
Deny from env=keep_out
4:44 pm on Feb 14, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


...Too often, mod_rewrite is like shooting flies with an elephant rifle. There's a reason the apache docs are always urging you to use some other mod (even if it's completely unrealistic for most people in most situations).

Hi Lucy, I get the inheritance aspect of SetEnvIf, and I like it, but don't understand why Mod Rewrite is overkill. Does it use more server resources, is there more risk, is it slower, can more things break, are there more unintended consequences?

To me SetEnvIf and mod rewrite are like sliced bread and a bagel: Both are bread, taste good, can be toasted, but if you like poppy seed sprinkles you'll get more on a bagel.
7:34 pm on Feb 14, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13737
votes: 458


Pro tip: If you use [ code ] tags instead of [ quote ], auto-linking won't happen.

Edit: Those log entries are odd. Identical requests, except from different IPs, leading to either 200 or 403? That would make me strongly suspect that the "blogspot" rule isn't getting evaluated at all, and instead some of the requests are blocked because they come from IPs that are listed elsewhere in "Deny from" directives.

If it's your own server, you can look at RewriteLogs for more information. If nothing else, you'll see if the code is even executing. Error logs won't help, as they never say anything but "Request denied by server configuration". (Thanks, Apache. Very helpful.)
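
On a server you control, rewrite logging is switched on in the main server or virtual-host configuration. A minimal sketch follows; the log path is a placeholder, and the trace level is just a reasonable starting point.

```apache
# Server or virtual-host configuration only; not valid in .htaccess.

# Apache 2.4: mod_rewrite tracing is written to the error log.
LogLevel alert rewrite:trace3

# Apache 2.2 equivalent (these directives no longer exist in 2.4):
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3
```

Tracing is verbose and slows the server, so it is usually turned off again once the problem is found.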
7:48 pm on Feb 14, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Hey Lucy,
I am able to block a few of their IPs with deny from statements, but they come up as 500s. They use so many IPs that very few repeat. I do track their IPs in the hope of finding a pattern, but Indonesia and Malaysia have a lot of phones, which use a lot of IPs.

I'm on a shared server, as you have already guessed, so have no access to rewritelogs. All I have is the raw access log and the error log (lots of request denied by server configuration).
3:41 am on Feb 15, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13737
votes: 458


I am able to block a few of their IPs with deny from statements, but they come up as 500s.

Er... That certainly shouldn't be happening as a result of an ordinary "Deny from" statement. Can you give an example?

Never waste your time blocking individual IPs. Look for whole ranges that you can block as /20 or /12 or whatever it may be. (I never go lower than /24 which is equivalent to aa.bb.cc) There's an ongoing Server Farms thread next door in the SSID subforum.
4:38 am on Feb 15, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


The funny thing is that I've been trying to kill these hotlinkers for almost a year and I don't know what I can and cannot do, so I try everything. I have been tracking their IPs, trying to find patterns, and when IPs coalesce I accumulate ranges.
112.215.174.196 [14/Feb/2017:05:27:22 GET /stuff/*rash2.jpg HTTP/1.1 500 - http://2*.blogspot.co.id/2012/08/miliaria.html?m=1
114.121.237.179 [13/Feb/2017:13:33:20 GET /stuff/*rash2.jpg HTTP/1.1 500 - http://2*.blogspot.co.id/2012/08/miliaria.html?m=1
8.37.225.89 [14/Feb/2017:00:58:23 GET /stuff/*beetle.jpg HTTP/1.1 500 - http://*.blogspot.de/2013/08/updated-kaget-banget-lipstick-baruku.html

deny from 112.215.170.0/23 112.215.172.0/22
deny from 114.121.232.0/21
deny from 8.37.225.0/24

If I see 2 ip addresses with a common a.b.c. I'll ban a.b.c.0/24. If I notice a.b.c+1.0/24, I'll ban a.b.c.0/23. If a.b.c+2.0/24 and a.b.c+3.0/24 comes along I'll ban a.b.c.0/22. I have no other solution than to ban IPs because none of my rewrite conditions have any effect. I am ecstatic that I got a 403 to work yesterday, the first in almost a year.
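
One detail about the merging described above: two neighbouring /24 blocks combine into a /23 only when the lower one has an even third octet, and four combine into a /22 only when the lowest is a multiple of 4. The ranges from this post happen to satisfy that, as the annotated sketch below shows.

```apache
# 170 is even, so 112.215.170.x and 112.215.171.x merge into one /23.
Deny from 112.215.170.0/23

# 172 is a multiple of 4, so 112.215.172.x through 112.215.175.x
# merge into one /22.
Deny from 112.215.172.0/22

# By contrast, 171 and 172 are adjacent but do not form a /23 on their own:
# a /23 cannot start at an odd third octet. They need two /24 entries.
# Deny from 112.215.171.0/24
# Deny from 112.215.172.0/24
```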
4:43 am on Feb 15, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:8603
votes: 381


TorontoBoy - Blocking individual user's IP addresses is pointless. They are not the source.

You were given the correct code to block requests referring from blogspot.
8:50 pm on Feb 15, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


PS If you add lines to the htaccess file where your WP section is, make sure the additional lines are before that WP part, not after.
Thanks for the tip Not2Easy.
Why is it better to add additional rewrite rules before the WP part?
The reason is that WP's htaccess modifies the incoming URL, and then executes it. If you add your rewrite rules after the WP part, the incoming url will be modified and executed before it gets to your additional rewrite rule, meaning your rewrite rule will never be executed.
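
As an illustration of that ordering, here is the WordPress block quoted earlier in the thread with the hotlink rule placed ahead of it. This is a sketch only; the WordPress section is copied unchanged, including its placeholder paths.

```apache
# Custom blocking rules go first.
RewriteEngine On
RewriteCond %{HTTP_REFERER} blogspot [NC]
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /xxx/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /wp/index.php [L]
</IfModule>

# END WordPress
```

A request for a file that no longer exists on disk passes both !-f and !-d, so the final WordPress rule sends it to index.php and stops there. A blocking rule placed after the WordPress section would not be consulted for that request.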
3:59 pm on Feb 16, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


SetEnvIf was my magic bullet, slaying all of *.Blogspot.* for my 5 sites. I did not understand Mod_Rewrite inheritance between subdirectories (there is none), though all my "deny from a.b.c." statements have been working. Deny from statements are not Mod_Rewrite directives. A partially successful htaccess had thrown me off.

I'll be slowly converting my Rewritecond statements over to SetEnvIf, while culling the growing forest of spammer UAs and Referers. SetEnvIf inherits to all my subdirectories, meaning that I don't have to touch the lower treed htaccess.

I am now a big believer in SetEnvIf, for its inheritance property. I still do not know why Mod_Rewrite is too powerful a tool, vs SetEnvIf. Other than the inheritance property they seem to be equivalent. The little nuances of htaccess are still critical.

Thanks all for your time and effort.