My thumbnails are in different directories, which makes this challenging. They all end in _sm.jpg. Is there a way to block images by that ending, no matter which directory they are in? If not, is it possible to block images by size, for example any image smaller than 5 KB?
tangor
8:41 pm on Sep 14, 2017 (gmt 0)
robots.txt only provides suggestions to crawlers. It has no teeth to prevent bad actors from ignoring those suggestions.
What you want can be done in .htaccess.
keyplyr
8:45 pm on Sep 14, 2017 (gmt 0)
First, doing anything via robots.txt will only affect the agents that support robots.txt. All the other million or so agents that do not support robots.txt directives will continue to do what they want unless managed with alternative solutions.
But if you are only concerned with Google, Bing and Yandex, you could use a wildcard disallow:
Disallow: /*_sm.jpg
virtualreality
10:11 pm on Sep 14, 2017 (gmt 0)
Thank you both for your replies. Is the .htaccess solution better, then? If so, how can I configure it?
not2easy
10:32 pm on Sep 14, 2017 (gmt 0)
The htaccess solution simply adds a noindex X-Robots-Tag response header (not a meta tag, since image files cannot carry one) to all files in a folder. Given that you mentioned that the images are in several folders, the robots.txt disallow might serve you better.
Two things to help you decide which way to go:
1. If you disallow images in robots.txt, Google may report them as "Blocked Resources" and claim that it can't determine whether a page using those resources is mobile-friendly.
2. If you use the X-Robots-Tag header method via .htaccess, you could block all files in that folder from indexing. It would not prevent crawling of that folder. More importantly, rewrite rules in root htaccess file might not execute in those folders with an additional htaccess file.
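For reference, a minimal sketch of the header method. It assumes Apache with mod_headers enabled and .htaccess overrides permitted for it, neither of which has been confirmed in this thread. Placed in the root .htaccess, a FilesMatch block tests only the file name, so it should cover thumbnails ending in _sm.jpg in every sub-folder without a separate .htaccess in each one:

```apache
# Root .htaccess (assumption: mod_headers is available on this server)
<IfModule mod_headers.c>
    # Match any file whose name ends in _sm.jpg, in this folder or below it
    <FilesMatch "_sm\.jpg$">
        # Ask compliant crawlers not to index the matched images
        Header set X-Robots-Tag "noindex"
    </FilesMatch>
</IfModule>
```

A crawler can only see this header if it is allowed to fetch the image, so the same files should not also be disallowed in robots.txt.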
virtualreality
10:51 pm on Sep 14, 2017 (gmt 0)
Thank you, not2easy!
lucy24
11:06 pm on Sep 14, 2017 (gmt 0)
rewrite rules in root htaccess file might not execute in those folders with an additional htaccess file.
That is: if and only if the additional htaccess file also contains RewriteRules without an "inherit" directive. If the sole content of your supplementary htaccess files is to set a noindex header, there is no risk.
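A sketch of the two cases, using hypothetical per-folder files. The first contains no mod_rewrite directives, so the root rules keep applying in that folder:

```apache
# Hypothetical /thumbs/.htaccess, case 1: header only, no rewrite directives
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noindex"
</IfModule>
```

The second has rewrite rules of its own, which would normally replace the root rules for that folder. Adding RewriteOptions Inherit asks Apache to apply the parent's rules as well. The rule shown is only a placeholder:

```apache
# Hypothetical /thumbs/.htaccess, case 2: own rewrite rules plus inheritance
RewriteEngine On
# Also apply the RewriteRules from the parent .htaccess
RewriteOptions Inherit
# Placeholder folder-specific rule (illustrative names only)
RewriteRule ^old_sm\.jpg$ new_sm.jpg [L]
```

Inherited rules are evaluated in the context of the sub-folder, so rules with relative patterns may behave differently there; test before relying on this.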