In my efforts to put a stop to it, I was directed to this thread:
[webmasterworld.com...]
After trying every variation I could think of on each of the suggestions there, making whatever substitutions in the code I could figure out, given my unfortunately limited knowledge of the subject (I've been running a site for two years, but can still barely make myself understood in simple HTML), I finally got positive results with the following:
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com [NC]
RewriteRule \.(gif|jpeg?|jpg)$ - [F]
Later in the thread I note that this may give search engines a problem. But since I don't speak the language well enough, I'm not absolutely certain what the problem will be.
Does this mean Google (for example) will not be able to update its knowledge of my content?
That would be a very hard price to pay -- none of my new articles making it into search engines. But it was SO hard for me to get ANYTHING working, that I'm reluctant to mess with it.
Of course, I will if I have to. The question is -- do I have to?
Thanks to any who can help.
Welcome to WebmasterWorld [webmasterworld.com]!
This Introduction to mod_rewrite [webmasterworld.com] tutorial and the resources that it cites may help you to get up to speed on mod_rewrite and redirects in general.
The "problems with search engines" you're worried about are not with search engines per se, but rather with the page caching and translation facilities that many of them provide. You might want to go try a few of them; Google's cached web pages are available from their search results and AltaVista's "Translate" feature is available from their home page. Then you can decide whether those features are useful to your visitor-base.
The block you have in place will not stop Googlebot from indexing your HTML pages. It will stop Googlebot-Image from indexing your images, but that is usually not a problem unless you want your images to appear in Google Image Search.
I would advise you to add at least the first missing line to your rewrite rules (from the thread you cited). Otherwise, you will block image access for any visitors who reach the 'net from behind a caching proxy (most medium-to-large corporations, for example), because their requests often arrive with a blank referrer.
The line I'm talking about is this one:
RewriteCond %{HTTP_REFERER} !^$
I would encourage you to learn as much as you can about the powerful tools available to you on your server like mod_rewrite. They will ultimately make your life easier, even if the initial learning curve is a bit steep. :)
HTH,
Jim
This file as it stands right now will NOT prevent Google etc. from updating its index of my site. So surfers WILL be able to find my new articles a week or so after they go up (which I think is about how long Google takes to list them).
But it WILL prevent their caching it, and their reproducing my images.
Cool! I definitely want them to index me, don't particularly care if they cache me, and don't want them displaying my images. So it's going to do exactly what I want!
Now -- I've just looked over the thread again, and don't see that line in the code I based my file on. So, you're saying I should add it -- where? AFTER the line "RewriteEngine on" and BEFORE the line that contains my domain name, correct?
You ain't just whistlin' Dixie about the steep initial learning curve. My head is swimming! But I followed your link, and a link there to apache.org, and bookmarked the article on Module mod_rewrite for future study. I'll get there! Too bad there isn't a book called .htaccess for Dummies, but I can certainly see the value in taking the trouble.
As Valentine Michael Smith said, I am only an egg... And I really appreciate it when you full-grown sentients help me out with this stuff.
Confirmed. This will not affect indexing of non-image-type resources.
The order of RewriteConds is not critical in this case. Just use this post [webmasterworld.com] as a template, and leave out the RewriteConds you don't want.
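As a rough sketch of what that would look like for your file, assuming the same mydomain.com placeholder and only the blank-referrer condition added (the # comment lines are mine, for illustration):

```apache
Options +FollowSymLinks
RewriteEngine on
# Let requests with a blank referrer through (proxies and firewalls often strip it)
RewriteCond %{HTTP_REFERER} !^$
# Let requests coming from your own pages through
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com [NC]
# Any other request for an image gets a 403 Forbidden
RewriteRule \.(gif|jpeg?|jpg)$ - [F]
```

Consecutive RewriteConds are ANDed by default, so the rule fires only when the referrer is both non-blank and not from your own domain. That is why swapping the two conditions would not change the outcome.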
Re VMS quote: Time enough for mod_rewrite... ;)
Jim
Keep two copies of your images: one copy is the original, and the second copy is the same images but with your stamp or a watermark in the corner.
Something like:
www.MySite.com
Item: #1234
Then configure mod_rewrite to send the original in response to requests that specify your site as the referrer, and the second copy to anything else.
Also, configure it to send the original to Googlebot if you want your images in the Google Images and Froogle sections.
That way, most of your site visitors will see a correct image.
Those that don't send referrers at all will still see something.
And the rest... well, I love free advertising.
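A minimal sketch of that setup, assuming the watermarked copies live under a hypothetical /watermarked/ directory that mirrors the paths of the originals, and using the mydomain.com placeholder from earlier in the thread. The directory name and the user-agent test are my assumptions, not something specified above:

```apache
RewriteEngine on
# Requests referred by your own pages keep the original image
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com [NC]
# Googlebot (including Googlebot-Image) also keeps the original
RewriteCond %{HTTP_USER_AGENT} !Googlebot [NC]
# Don't rewrite requests that already point at the watermarked copy (avoids a loop)
RewriteCond %{REQUEST_URI} !^/watermarked/
# Everything else, blank referrers included, is served the watermarked copy
RewriteRule ^(.+\.(gif|jpeg?|jpg))$ /watermarked/$1 [NC,L]
```

This is written for an .htaccess file in the site root. There is deliberately no blank-referrer exemption here, so visitors who send no referrer get the watermarked version rather than an error.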
Make sure your site search supports search by item numbers, just for this.
Also, use mod_expires to set the expiration header for the images to something like 2 weeks.
That way, you save on bandwidth, and the original image stealers don't see your watermarks in their weblogs (their browsers still hold the original they cached while visiting your site), but other people do. So the authors of the posts with your images don't even realise they are promoting your site.
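A sketch of the two-week expiry setting, assuming mod_expires is loaded on your server and your host allows these directives in .htaccess:

```apache
<IfModule mod_expires.c>
    # Turn on generation of Expires and Cache-Control headers
    ExpiresActive On
    # Browsers may reuse a cached image for two weeks after fetching it
    ExpiresByType image/gif "access plus 2 weeks"
    ExpiresByType image/jpeg "access plus 2 weeks"
</IfModule>
```

The IfModule wrapper simply skips the directives if the module is not available, rather than producing a server error.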