Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Blocking Bots for Images
want to know for sure if this .htaccess line would block all bots
warner carter




msg:4625947
 12:49 am on Nov 26, 2013 (gmt 0)

Can someone more knowledgeable than me about .htaccess confirm that this would block all bots from seeing any .jpg images?

<FilesMatch "\.jpg$">
order allow,deny
deny from all
</FilesMatch>

 

lucy24




msg:4625949
 1:09 am on Nov 26, 2013 (gmt 0)

The quoted lines will block all bots from seeing .jpg images. (But not .JPG, .jpeg et cetera, let alone .png or other formats.) You don't need FilesMatch, though; for a single extension, a simple <Files "*.jpg"> will do.


The quoted lines will also block all other requests for .jpg files from all users at all times under all circumstances.

Inquiring minds want to know: If nobody is ever allowed to see them, why has the site got .jpg files in the first place?
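
To illustrate the point about extensions: a rough sketch of the same block widened to cover other spellings and formats. The extension list is only an example, and the (?i) flag assumes the PCRE regex engine that Apache normally uses. It has the same drawback as the original and shuts out human visitors too.

```apache
# Matches .jpg, .jpeg, .png and .gif in any letter case.
# WARNING: like the original rule, this denies EVERY request, not just bots.
<FilesMatch "(?i)\.(jpe?g|png|gif)$">
    # Apache 2.2 syntax, as used in the question
    Order allow,deny
    Deny from all
    # On Apache 2.4 (mod_authz_core) the equivalent is: Require all denied
</FilesMatch>
```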

warner carter




msg:4626375
 4:58 pm on Nov 27, 2013 (gmt 0)

Thanks for your answer.

I was looking for a way to block only bots from seeing the images. I saw that .htaccess info at the WordPress forum in a similar discussion but wasn't certain it would work. Apparently it would work too well and nobody could see the images.

I'm wondering if there is a way to block just bots. There's robots.txt, of course, but that can easily be ignored.

topr8




msg:4626410
 6:34 pm on Nov 27, 2013 (gmt 0)

blocking bots can be very time consuming ... there is no simple one line solution

there is a whole forum dedicated to it here at WebmasterWorld

[webmasterworld.com...]

lucy24




msg:4626449
 9:21 pm on Nov 27, 2013 (gmt 0)

A robot that ignores explicit, correctly worded robots.txt directives has earned the right to an unconditional ban. But with images, unlike text, it's normally enough to put robots.txt blocks on the relevant directories. You can also exclude any single-purpose robots such as Googlebot-Image or msnbot-media.
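
For robots that ignore robots.txt, the ban can be enforced in .htaccess by user-agent. A minimal sketch, assuming mod_setenvif is available and .htaccess overrides are allowed; "BadBot" and "EvilScraper" are made-up names standing in for whatever shows up in your own logs, and block_bot is just an arbitrary variable name.

```apache
# Flag requests whose User-Agent contains a (hypothetical) bad-bot name.
SetEnvIfNoCase User-Agent "BadBot" block_bot
SetEnvIfNoCase User-Agent "EvilScraper" block_bot

<FilesMatch "(?i)\.(jpe?g|png|gif)$">
    # Apache 2.2 syntax: everyone is allowed except flagged requests.
    # With "Order allow,deny" a request matching both Allow and Deny is denied.
    Order allow,deny
    Allow from all
    Deny from env=block_bot
</FilesMatch>
```

This only catches robots that announce themselves honestly. Anything sending a browser-like User-Agent will pass straight through, which is part of why there is no one-line solution.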

Question whose answer I don't know: Is there an image-search equivalent to "robots.txt prevents us from showing a snippet"? For example, suppose you search for "pictures of widgets", where widgets are something extremely rare. If there's a page with lots of text about widgets but the images themselves are roboted-out, would the searcher see anything? I'm inclined to think that if the robot hasn't seen the image, its existence won't even be mentioned in search results.

not2easy




msg:4626483
 1:37 am on Nov 28, 2013 (gmt 0)

You can no-index images by sending an X-Robots-Tag header from .htaccess. It does not prevent robots from crawling the images, but they will not show up in the image searches of mainstream search engines. That doesn't mean they can't be indexed, and it doesn't mean they won't be scraped and indexed from somewhere else.
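
A minimal sketch of that header approach, assuming mod_headers is enabled on the server; the extension list is only an example.

```apache
# Send "X-Robots-Tag: noindex" with every matching image response.
# The IfModule wrapper avoids a 500 error if mod_headers is not loaded.
<IfModule mod_headers.c>
    <FilesMatch "(?i)\.(jpe?g|png|gif)$">
        Header set X-Robots-Tag "noindex"
    </FilesMatch>
</IfModule>
```

The robot has to be able to fetch the image to see the header, so the same files should not also be disallowed in robots.txt.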


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved