I run a site on, let's say, adomain.com, and I'm finding lots of 404s for non-existent files, requested by users whose referrers point to a different site.
For example the error will be on the URL:
/mediastore/Design/football365/menu/teams.gif
Whilst the referrer is:
h**p://delb.myspace.com/html.ng/site=myspace&position=leaderboard&page=11011001&rand=4719568640&acnt=1
Another example is:
/totalbet/images/tb_scroller_foot.gif
The referrer is:
h**p://www.redcolobus.com/totalbet/tb_fb365_fbtb.html
(** so that they are not treated as links)
I suspect a DNS error for these users is leading them to think that the site they are looking for is on my IP. Certainly if so there is little I can do about it.
1) What are your thoughts on why these errors/referrers might occur?
They are getting very numerous.
I'd like a rewrite rule to redirect these users to a low-bandwidth page with a simple message telling them that they possibly have a DNS error, whilst retaining the custom 404 pages for users who don't match the criteria above. I am adept at rewrite rules on URLs but not at examining the REQUEST segment.
2) Can someone please help me create a rewrite rule to match when a relative URL is used AND the referrer is not on the site's domain?
Would this set be a good start?
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /?https?://(?:www\.)?adomain\.com [NC]
RewriteCond %{HTTP_REFERER} !^$ [OR]
RewriteCond %{HTTP_REFERER} !https?:\\(?:www\.)?adomain\.com\.*$ [NC]
RewriteRule .* http://www.adomain.com/message.html [NC,L]
Have I got my logic right? Can it be improved? Do I need AND or OR's in there?
Many thanks in advance for any pointers.
MattyUK
[edited by: jdMorgan at 5:52 am (utc) on Feb. 23, 2006]
[edit reason] de-linked. [/edit]
The value saved in %{THE_REQUEST} is the request line, that is, the first line of the HTTP request sent by the client browser. For example:
GET /images/some_image.gif HTTP/1.1
Note that for an ordinary (non-proxy) request, %{THE_REQUEST} *does not* contain "http://" followed by a domain name; that information is passed in a separate (new-for-HTTP/1.1) "Host" header, available to mod_rewrite in the %{HTTP_HOST} variable.
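As a hedged illustration of that distinction, the sketch below tests the Host header rather than the request line. If the DNS theory is right, these requests should arrive carrying some other site's hostname in %{HTTP_HOST}. The file name /message.html, the use of a per-directory .htaccess file in the document root, and the idea that such requests reach this virtual host at all are assumptions made for illustration, not details confirmed in this thread.

```apache
# For a request such as:
#   GET /images/some_image.gif HTTP/1.1
#   Host: www.adomain.com
# %{THE_REQUEST} holds "GET /images/some_image.gif HTTP/1.1"
# %{HTTP_HOST}   holds "www.adomain.com"

RewriteEngine On

# A Host header is present ...
RewriteCond %{HTTP_HOST} !^$
# ... but it names neither adomain.com nor www.adomain.com (optional port allowed)
RewriteCond %{HTTP_HOST} !^(www\.)?adomain\.com(:[0-9]+)?$ [NC]
# Never rewrite the message page itself
RewriteCond %{REQUEST_URI} !^/message\.html$
# Internally rewrite to the small message page (hypothetical file name)
RewriteRule ^ /message.html [L]
```

If the logged Host values turn out to be adomain.com after all, this rule never fires, which would itself count as evidence against the DNS explanation.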
I'm not clear on the mechanism resulting in these requests. Have you examined any of the referring pages to determine if this is just a simple hotlinking case? If not, then your DNS theory sounds likely.
If the requests are always for images, and you have a limited number of known referrers, then an effective solution should be possible. Please post some more info.
Jim
Thank you. Sorry it took a while for me to get back to this. I thought I had it set to email me on reply, I guess I didn't.
Yes, I checked; no hotlinking is evident. At least when the source code of the referring page is examined, neither the domain nor the server IP appears anywhere in it.
The two examples I posted were for images but, from memory, the requests span both pages and images. Mostly images; pages are very occasional. I'd be happy for images to be broken and pages to show my message page. It might be nice if the rule detected which was which and directed to an image or page accordingly. I guess once I have the rule logic it'll be easy enough to adjust.
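A minimal sketch of the behaviour described above, offered as a starting point rather than a tested answer. It assumes the rules live in an .htaccess file in the document root, that /message.html is the low-bandwidth message page, and that gif, jpg, jpeg and png cover the image types involved; all three are assumptions, not details given in this thread.

```apache
RewriteEngine On

# The referrer is not blank ...
RewriteCond %{HTTP_REFERER} !^$
# ... and is not a page on adomain.com or www.adomain.com
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?adomain\.com(/|$) [NC]
# The requested path is neither an existing file nor an existing directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Leave image requests alone so they simply fail with the usual 404
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png)$ [NC]
# Everything else gets the message page via an internal rewrite
RewriteRule ^ /message.html [L]
```

RewriteCond lines are implicitly ANDed, so no [OR] flag is needed here; with [OR] between the two referrer tests, a blank referrer would still satisfy the second test and trigger the rule. One caveat: a visitor who follows a genuinely broken link from another site would also match these conditions and see the DNS message.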
I do have hotlinking protection in the form of a rule that allows blank referrers and a small list of allowed referrers; otherwise it serves a 'no hotlink' image.
I was swamped with requests from China a short while back. They seemed to think the server was a proxy when it wasn't and I used this rule to direct them to a page telling them 'no'.
#no proxy
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /?https?:// [NC]
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /?https?://(?:[^.]+\.)?adomain\.co\.uk [NC]
RewriteRule .* http://www.adomain.co.uk/ad_dns_error_pr0xy_attempt.html [NC,P,L]
Does your point about THE_REQUEST and HTTP_HOST affect this defense?
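On the proxy question, a hedged sketch: when a client treats the server as a proxy, the request line itself carries an absolute URL, so %{THE_REQUEST} is the variable that shows it. The variant below keeps the two conditions posted above unchanged but answers with 403 Forbidden via [F] instead of using the [P] flag, which hands the request to mod_proxy. Choosing [F] is an assumption about what is wanted here, not something suggested in the thread.

```apache
# Proxy-style request line (absolute URL in the request line):
#   GET http://www.example.com/index.html HTTP/1.1
# Ordinary request line (path only; the hostname travels in the Host header):
#   GET /index.html HTTP/1.1

RewriteEngine On

# The request line contains an absolute http:// or https:// URL ...
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /?https?:// [NC]
# ... and that URL is not on adomain.co.uk or one of its subdomains
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /?https?://(?:[^.]+\.)?adomain\.co\.uk [NC]
# Refuse the request outright with 403 Forbidden
RewriteRule ^ - [F]
```

A 403 response costs very little bandwidth and needs no proxy module, at the price of not showing the explanatory page.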
I'm eager for your feedback. Should I post any other information? Thank you in advance.
MattyUK
[edited by: jdMorgan at 5:51 am (utc) on Feb. 23, 2006]
[edit reason] de-linked. [/edit]