Msg#: 4625945 posted 1:09 am on Nov 26, 2013 (gmt 0)
The quoted lines will block all bots from seeing .jpg images. (But not .JPG, .jpeg et cetera, let alone .png or other formats.) You don't need FilesMatch, though; for a single extension, a simple <Files "*.jpg"> will do. (Note the wildcard: without it, the section matches only a file literally named ".jpg".) The quoted lines will also block all other requests for .jpg files from all users at all times under all circumstances.
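For illustration, here is roughly what the single-extension form looks like. This is only a sketch: it assumes the quoted lines wrapped a deny-everyone directive, and it uses Apache 2.4 syntax, so adjust for your server version.

```apache
# Sketch only: assumes the original rule denied all access.
# Matches any file whose name ends in ".jpg" (case-sensitive on most servers).
<Files "*.jpg">
    # Apache 2.4 syntax. On Apache 2.2 the equivalent would be:
    #   Order allow,deny
    #   Deny from all
    Require all denied
</Files>
```

Note that nothing in this block looks at who is asking, which is why it shuts out human visitors as well as robots.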
Inquiring minds want to know: If nobody is ever allowed to see them, why has the site got .jpg files in the first place?
Msg#: 4625945 posted 4:58 pm on Nov 27, 2013 (gmt 0)
Thanks for your answer.
I was looking for a way to block bots from seeing the images only. I saw that .htaccess info at the wordpress forum in a similar discussion but wasn't certain it would work. Apparently it would work too well and nobody could see the images.
I'm wondering if there is a way to block just the bots. There's robots.txt, of course, but that can easily be ignored.
Msg#: 4625945 posted 9:21 pm on Nov 27, 2013 (gmt 0)
A robot that ignores explicit, correctly worded robots.txt directives has earned the right to an unconditional ban. But with images, unlike text, it's normally enough to put robots.txt blocks on the relevant directories. You can also exclude any single-purpose robots such as Googlebot-Image or msnbot-media.
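If a robot does earn that ban, one common way to apply it to images only is with mod_rewrite in htaccess. A minimal sketch, assuming mod_rewrite is available; "BadBot" is a made-up placeholder, not a real user-agent, and the extension list is just an example:

```apache
RewriteEngine On

# "BadBot" is a hypothetical name; substitute the user-agent string
# of the robot that ignored robots.txt. [NC] = case-insensitive match.
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]

# Return 403 Forbidden for image requests from that user-agent only.
# Everyone else falls through and gets the image as usual.
RewriteRule \.(jpe?g|png|gif)$ - [F,NC]
```

Keep in mind this only works against robots that identify themselves honestly; anything that fakes a browser user-agent will walk right past it.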
Question whose answer I don't know: Is there an image-search equivalent to "robots.txt prevents us from showing a snippet"? For example, suppose you search for "pictures of widgets", where widgets are something extremely rare. If there's a page with lots of text about widgets but the images themselves are roboted-out, would the searcher see anything? I'm inclined to think that if the robot hasn't seen the image, its existence won't even be mentioned in search results.
Msg#: 4625945 posted 1:37 am on Nov 28, 2013 (gmt 0)
You can noindex images using an X-Robots-Tag header set in htaccess. It does not prevent robots from crawling the images, but they will not show up in the image searches of mainstream search engines. That doesn't mean they can't be indexed, and it doesn't mean they won't be scraped and indexed from somewhere else.
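Something along these lines should do it. It's a sketch that assumes mod_headers is enabled on the server, and the list of extensions is only an example:

```apache
<IfModule mod_headers.c>
    # Apply to common image extensions; extend the list as needed.
    <FilesMatch "\.(jpe?g|png|gif)$">
        # Ask compliant search engines not to index these files.
        Header set X-Robots-Tag "noindex"
    </FilesMatch>
</IfModule>
```

The robot has to be allowed to request the image in order to see the header, so don't combine this with a robots.txt block on the same files.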