a very strange referrer in my logs, is it a kind of spider?
"http://www.widgets.com/widget.html" is one of my pages that I find many times in my logs as it links to many of my other pages.
But this time I found the following in my logs:
Referrer:
"wysiwyg://25/http://www.widgets.com/widget.html"
came with agent:
Mozilla/4.79 [en] (Windows NT 5.0; U)
from IP:
4.20.190.2
I looked it up, and 4.20.190.2 belongs to:
Mid-Columbia Library District MIDCOLIBDIS-190-21
thanks,
Welcome to WebmasterWorld [webmasterworld.com]!
It's a Netscape 4.x browser loading JavaScript or image links.
Jim
So then this guy is not getting my images.
I had just added the image hotlinking code to .htaccess using mod_rewrite, and this was the first request I saw in my logs that was served the replacement image "hot.gif" instead of the normal images, because it was not referred by my own website: [widgets.com...]
Are there many of these about?
Thanks,
No, this is the only "weird" case I know of. Look for wysiwyg:// followed by a one- or two-digit number in addition to the Mozilla/4.79 user-agent. Something like this:
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/4 [OR]
RewriteCond %{HTTP_REFERER} !^wysiwyg://[0-9]{1,2}/http://www\.yourdomain\.com [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ /hot.$1 [L]
I have it as follows:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
. . . big list of spiders and bots here. . . .
RewriteCond %{REQUEST_URI} FormMail.*
RewriteRule /*$ [987gotothere784.com...] [L,R]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?widgets\.com(/)?.*$ [NC]
RewriteRule .*\.(gif|jpg|jpeg|bmp)$ [widgets.com...] [R,NC]
with "widgets.com" being my web site and "hot.png" being a replacement banner saying "welcome to visit widgets.com".
So that is the reason that the referrer mentioned in the message above got served a replacement picture.
And what about changing:
"http://widgets.com/hot.png"
into something like:
<a href="http://widgets.com"><img border="0" src="http://widgets.com/hot.png"></a>
So that hotlinkers would be giving a link back with (maybe) some good PR, instead of just using free bandwidth.
Would this be possible, and how would I do it?
Thanks
No, you're not going to be able to redirect to an html-format link. Also, you should not redirect from one file type (graphic) to another (html) - or even from one graphic type to another, if it is important to you that your alternate graphic always display as intended. Changing filetypes may confuse some browsers, and that defeats your purpose.
Jim
Thanks for your response,
I took the Hot-Linking code from the new Hot-Linking online code generator at: [htmlbasix.com...]
First, the tutorial explained that since we are blocking gif, jpg, jpeg and bmp, the replacement image should be in another format (one that we are not blocking), as otherwise this could send the server into a loop.
If we block the gif format and then substitute a gif, the replacement image would also get blocked, and the server would try to replace it with the replacement gif, which again would get blocked, and so on.
I am certainly not a specialist on mod_rewrite. I tried many examples from this board, but none seemed to work,
until I discovered this new tool that generates your mod_rewrite code online.
That's how I got this code, and it seems to work very well.
Should I change the Code?
Thanks,
PS: there is an error in the posted code, maybe the board changed it, but the last line should read:
RewriteRule .*\.(gif|jpg|jpeg|bmp)$ [widgets.com...] [R,NC]
Simple mod_rewrite code to steer image requests to different image but correct file-type, and without looping:
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com [NC]
RewriteCond %{REQUEST_URI} !^/hot
RewriteRule \.(gif|jpe?g|png|bmp)$ /hot.$1 [L]
A far simpler approach is to simply return a 403-Forbidden code to hotlink image requests:
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ - [F]
Jim
Thanks JdMorgan,
In reality I would not need bmp and png.
And where should I store those 3 replacement images, hot.gif, hot.jpg and hot.jpeg? Should this be in the root of my site (www.mysite.com/)?
Would your code then also protect all folder files of my site, as I have images stored in nearly all levels, such as:
mysite.com/sub1/map.gif
mysite.com/sub1/subsub2/artwork.jpg
mysite.com/logo.gif
etc...etc
Thanks,
Would your code then also protect all folder files of my site
nilloc,
The .htaccess file in your root folder controls every folder and file below that folder in your directory structure,
UNLESS
you have additional .htaccess files in the folders below.
If an additional .htaccess file exists in a folder, then the root file only protects the folders above that one.
Additionally, you may store the replacement files in any location you like, as long as the rule points to that location.
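To illustrate the point about additional .htaccess files: this normally only matters when the subfolder's .htaccess contains mod_rewrite directives of its own, in which case the rewrite rules from the root are not applied to that folder unless you ask for them. A minimal sketch, assuming a hypothetical subfolder /sub1/ with its own .htaccess; check the behaviour on your own server before relying on it:

```apache
# /sub1/.htaccess (hypothetical subfolder that has its own rewrite rules)
RewriteEngine on
# Ask mod_rewrite to also apply the rules from the parent folder's .htaccess,
# so the hotlink protection defined in the root still covers this folder
RewriteOptions Inherit
# ... rules specific to /sub1/ go here ...
```

If a subfolder's .htaccess contains no rewrite directives at all, this should not be needed.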
So I would have to change JdMorgan's code as follows:
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com [NC]
RewriteCond %{REQUEST_URI} !^/hot
RewriteRule \.(gif|jpg|jpeg)$ [yourdomain.com...] [L]
and then store the 3 images in the root of my site as this:
www.yourdomain.com/hot.gif
www.yourdomain.com/hot.jpg
www.yourdomain.com/hot.jpeg
Is this correct?
You don't have to change my code at all. :)
Introduction to mod_rewrite [webmasterworld.com]
Jim
So I would have to change JdMorgan's code as follows: RewriteRule \.(gif|jpg|jpeg)$ [yourdomain.com...] [L]
nilloc,
If you're storing the files in your root,
you shouldn't have to change anything.
"com/hot.$1"
This portion denotes your root and files whose names begin with "hot".
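As a sketch of how the $1 back-reference behaves: the part of the pattern in parentheses captures the file extension, and $1 puts it back into the replacement file name, no matter which folder the requested image lives in. The paths in the comments are the examples from earlier in the thread (plus one hypothetical .jpeg file), yourdomain.com is a placeholder, and png and bmp are left out because they are not needed here.

```apache
RewriteEngine on
# Skip requests referred by our own pages
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com [NC]
# Skip the replacement images themselves, so the rule cannot loop
RewriteCond %{REQUEST_URI} !^/hot
# (gif|jpe?g) captures the extension; $1 reuses it in the target name:
#   /logo.gif                    -> /hot.gif
#   /sub1/map.gif                -> /hot.gif
#   /sub1/subsub2/artwork.jpg    -> /hot.jpg
#   /photo.jpeg (hypothetical)   -> /hot.jpeg
RewriteRule \.(gif|jpe?g)$ /hot.$1 [L]
```

With this pattern, the root needs one replacement file per extension that can be captured: hot.gif, hot.jpg and hot.jpeg.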
I have been thinking about something!
Now that my image hotlink protection is working OK, am I not blocking my pages from being cached by Google, and blocking the image robots from Google and other search engines?
And am I not also blocking the translation services of some search engines, such as AltaVista and Google?
Thanks,
You are blocking only images, but you are indeed blocking them from the Google, AltaVista, and freetranslation translators and from all search engine caches. Your .html and other pages are not blocked, only images.
Now that you've got the simple versions above figured out, have a go at this one:
# Block linking from outside our domain except Google, Yahoo, AltaVista, Gigablast,
# Comet Systems, SearchHippo, Wayback Machine, and freetranslation.com translators and caches
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://www\.mydomain\.com
# In the next condition, 127.0.0.3 stands for the server's IP address
RewriteCond %{HTTP_REFERER} !^http://127\.0\.0\.3
RewriteCond %{HTTP_REFERER} !^http://216\.239\.(3[2-9]|[45][0-9]|6[0-3])\..*www\.mydomain\.com
RewriteCond %{HTTP_REFERER} !^http://66\.218\.(64|[78][0-9]|9[0-5])\.[0-9]{1,3}/search/cache.*www\.mydomain\.com
RewriteCond %{HTTP_REFERER} !^http://babel\.altavista\.com/.*www\.mydomain\.com
RewriteCond %{HTTP_REFERER} !^http://216\.243\.113\.1/
RewriteCond %{HTTP_REFERER} !^http://search.*\.cometsystems\.com/search.*www\.mydomain\.com
RewriteCond %{HTTP_REFERER} !^http://.*searchhippo\.com.*www\.mydomain\.com
RewriteCond %{HTTP_REFERER} !^http://web\.archive\.org/web/.*/http://www\.mydomain\.com
RewriteCond %{REMOTE_ADDR} !^207\.228\.(19[2-9]|2[01][0-9]|22[0-3])\.
RewriteCond %{HTTP_REFERER} !^http://fets\.freetranslation\.com.*mydomain
RewriteCond %{HTTP_REFERER} !^wysiwyg://[0-9]{1,2}/http://www\.mydomain\.com
RewriteRule \.(jpg|jpeg?|gif|js|css)$ - [F]
Jim
Thanks JdMorgan...
I think I understand that here you are letting the good spiders and caches fetch the images.
Very interesting that you also block the JS and CSS files, and so also serve a replacement file for them.
Let me try this one out tomorrow and then report back to you. The reason is that I'm running out of dial-up hours and need to jump into town to fetch extra internet hours.
One more question, while it seems I have a grand master of mod_rewrite teaching me here.
Let's say I am re-uploading and changing a lot of stuff on my web site, and during that time (say 30 minutes to 1 hour) many links and pages would give a 404.
How would you write a temporary rule so that ALL REQUESTS for any file on "www.mydomain.com" go to a special page: "www.mydomain.com/temporary_ofline.htm"?
Thanks,
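The question about a temporary "site offline" rule is not answered in this thread. One possible approach, offered only as an untested sketch that assumes the offline page is stored in the site root under the file name given in the question:

```apache
RewriteEngine on
# Do not redirect the offline page itself, otherwise the rule would loop
RewriteCond %{REQUEST_URI} !^/temporary_ofline\.htm$
# Send every other request to the offline page with a temporary (302) redirect
RewriteRule .* /temporary_ofline.htm [R=302,L]
```

Remove or comment out these lines once the upload is finished. Any images or stylesheets used by the offline page would be redirected as well, so the page should be self-contained.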
RewriteCond %{HTTP_REFERER} !^$
...it seems to me, that you are actually allowing user-agents without a referring string to grab the pix.
An empty referrer string is something that could be good (=yourself and friends, knowing the direct URL and doing copy+paste)
-or it could be bad (a bot following a list of files/URLs)
As I see from page one (nilloc):
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
. . . big list of spiders and bots here. . . .
-some (many?) bots are probably being banned already. You should consider whether the statement above (!^$) could conflict with other statements and, if so, what side effects it might have.
You are correct, and this is the trade-off you must make. Blank referrers can come from bad-bots and harvesters, or they can be regular users who happen to be behind firewalls or proxies (home or corporate) or who are using products like Norton Internet Security.
I suggest each webmaster should make the decision for him/herself what UAs to allow or block.
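The trade-off can be seen by comparing two variants side by side. This is a sketch only, using mydomain.com as a placeholder and a shortened list of image types; use one variant or the other, not both.

```apache
# Variant A: allow blank referrers (friendlier to users behind proxies and
# firewalls, but also lets through bots that send no referrer)
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com [NC]
RewriteRule \.(gif|jpe?g)$ - [F]

# Variant B: block blank referrers too (stops referrer-less bots, but also
# blocks legitimate visitors whose software strips the referrer)
# RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com [NC]
# RewriteRule \.(gif|jpe?g)$ - [F]
```

The only difference is the `!^$` condition: with it, a request that carries no referrer never reaches the rule; without it, such a request gets the 403.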
nilloc,
I have posted the code as an example. I suggest you go through it and use the parts that apply to your situation.
In this code, I do not serve a replacement file. I simply return a 403-Forbidden response. For this site, it makes sense. For yours, maybe not.
Introduction to mod_rewrite [webmasterworld.com]
Jim