It seems porn sites are setting themselves up on blogspot.com sub-domains, which are owned by Google.
The format of the referrers is keyword1-keyword2.blogspot.com
The IP for the blogspot.com domains resolves to 66.102.15.101 which is Google.com
I don't want to ban Google from my sites with a line like
Deny from 66.102.15.101
Perhaps a wildcard format in .htaccess, in the form of *.blogspot.com, such as the line below? But is it proper?
RewriteCond %{HTTP_REFERER} ^http://(www.)?*.blogspot.com(/)?.*$ [OR]
Any help would be greatly appreciated.
Those log spam entries are left by an automatic bot. This bot will make its rounds whether you serve it a status 200, 403, or anything else. You can fry your brain about the right .htaccess syntax, but I don't think you'll achieve anything useful that way in such a case.
Many website owners have referrer pages to keep track of where incoming links originate.
I prefer not to go into detail and spread the technique of this particular method.
If you can help, I would appreciate it.
[edited by: jdMorgan at 1:12 am (utc) on Feb. 9, 2004]
[edit reason] speling [/edit]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?([a-z0-9\-]+)\.blogspot\.com(/)? [NC]
1) The "OR" condition implies you have other conditions, and want any "true" condition to trigger failure. Without seeing surrounding lines, I can't see if it's appropriate.
2) You need to match the sub-domain. I make the assumption that a sub-domain consists of letters, digits, and hyphens.
3) Put a backslash before periods, if you literally mean a period (not a wildcard).
4) Domain names can appear as upper or mixed case, so make no assumptions, and use NC (with or without OR).
Also, for it to work, you obviously have to have an appropriate RewriteRule.
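For reference, here is a minimal sketch of what a complete block might look like, pairing the condition above with a rule. The RewriteEngine line and the RewriteRule with the [F] flag (403 Forbidden) are assumptions about what "appropriate" means here, not something taken from anyone's actual configuration.

```apache
# Turn on mod_rewrite for this directory
RewriteEngine On

# True when the referrer is any sub-domain of blogspot.com,
# with or without a leading "www.", in any letter case
RewriteCond %{HTTP_REFERER} ^http://(www\.)?([a-z0-9\-]+)\.blogspot\.com(/)? [NC]

# "-" means no substitution; [F] answers the request with 403 Forbidden
RewriteRule .* - [F]
```

In a per-directory .htaccess file, mod_rewrite also requires Options FollowSymLinks (or SymLinksIfOwnerMatch) to be enabled for that directory.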
WARNING: AltaVista's scooter spider (and maybe others) provides referer information in GET requests. You could block such a spider with the above code.
Also, you should track the IP of the "bad" visitors (not the IP of the domain, which is GOOGLE). Then, if it's always the same, find out who owns it, and consider blocking that specific IP.
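If the logs do show a single repeat offender, blocking just that address could look something like the sketch below. The address 192.0.2.10 is a placeholder from the reserved documentation range, not a real spammer; substitute whatever address actually repeats in your own logs. This uses the same Deny directive as the line quoted in the original question.

```apache
# Allow directives are evaluated first, then Deny directives
Order Allow,Deny
Allow from all

# Hypothetical offender address (placeholder); this is NOT Google's IP
Deny from 192.0.2.10
```

On Apache 2.4 and later, the Require directives replace this Order/Allow/Deny syntax.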
RewriteCond %{HTTP_REFERER} ^http://.*blogspot\.com [NC,OR]
Jim
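To illustrate how the [OR] flag chains conditions, here is a rough sketch with a second condition added. The spam-example.invalid domain is made up purely for illustration, and the [F] rule is an assumption about the intended action.

```apache
RewriteEngine On

# Either condition being true is enough to trigger the rule
RewriteCond %{HTTP_REFERER} ^http://.*blogspot\.com [NC,OR]
# Hypothetical second referrer; the last condition carries no OR
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spam-example\.invalid [NC]

# Answer matching requests with 403 Forbidden
RewriteRule .* - [F]
```

Note that this looser pattern matches any referrer containing "blogspot.com" anywhere after http://, including in a path or query string, so it may catch more than blogspot sub-domains.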
Proper syntax derives from proper semantics, which means it depends on the purpose of what you're trying to do. There's not really any proper blocking syntax for a problem that can't be solved by blocking.
Many website owners have referrer pages to keep track of where incoming links originate.
There are no special pages needed to keep track of incoming links. What you probably mean are pages where those referring links are displayed for the general public to see. Those typically work based on an SSI script, which will indeed be circumvented if you block the respective page from loading. But do you really want to block real visitors just because they happen to come from some (legitimate) blog link? I assume that the spammers are still a tiny minority among the blogspot users.
You can't reliably protect your autogenerated links. The spammy blogspot referrers of today will be replaced by a dozen other domains tomorrow, and a hundred more next week. Keeping track of them all is not worth the effort, just for the dubious benefit of those bragging referrer displays. The semantic answer therefore really is not to display your referring links on your site. They are of no interest to your visitors anyway. If you think that your visitors should know about some of those sites, place a normal link to them somewhere. It's a lot easier to manage a few positive examples than to weed out the spam.
If you can help, I would appreciate it.
jdMorgan gave the pattern that I consider the most effective technically.