
Forum Moderators: goodroi


Is there a way to block thumbnails in robots.txt?

     
8:19 pm on Sep 14, 2017 (gmt 0)

Full Member

5+ Year Member

joined:June 26, 2008
posts:265
votes: 5


My thumbnails are in different directories, which makes this challenging. They all end in _sm.jpg. Is there a way to block images by ending, such as _sm.jpg, no matter what directory they are in? If not, is it possible to block images by size, for example blocking any image smaller than 5 KB?
8:41 pm on Sept 14, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:8164
votes: 608


robots.txt only provides suggestions to crawlers. It has no teeth to prevent bad actors from ignoring those suggestions.

What you want can be done in .htaccess.
8:45 pm on Sept 14, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11443
votes: 686


First, anything you do via robots.txt will only affect the agents that support robots.txt. All the other million or so agents that do not support robots.txt directives will continue to do what they want unless managed with alternative solutions.

But if you are only concerned with Google, Bing and Yandex, you could use a wildcard disallow:

Disallow: /*_sm.jpg$
10:11 pm on Sept 14, 2017 (gmt 0)

Full Member

5+ Year Member

joined:June 26, 2008
posts:265
votes: 5


Thank you both for your replies. Is the .htaccess solution better, then, and if so, how can I configure it?
10:32 pm on Sept 14, 2017 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3719
votes: 205


The htaccess solution simply adds a noindex X-Robots-Tag response header to all files in a folder (images cannot carry a meta tag). Given that you mentioned that the images are in several folders, the robots.txt disallow might serve you better.

Two things to help you decide which way you should go:
1. If you disallow images in robots.txt, Google may report them as "Blocked Resources" and claim that it can't determine whether a page where those Blocked Resources are used is Mobile Friendly or not.

2. If you use the X-Robots-Tag header method via .htaccess, you could block all files in that folder from indexing. It would not prevent crawling of that folder. More importantly, rewrite rules in the root htaccess file might not execute in folders that have an additional htaccess file.
10:51 pm on Sept 14, 2017 (gmt 0)

Full Member

5+ Year Member

joined:June 26, 2008
posts:265
votes: 5


Thank you, not2easy!
11:06 pm on Sept 14, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14707
votes: 613


"rewrite rules in root htaccess file might not execute in those folders with an additional htaccess file."

That is: if and only if the additional htaccess file also contains RewriteRules without an "inherit" directive (RewriteOptions Inherit). If the sole content of your supplementary htaccess files is to set a noindex header, there is no risk.
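
For anyone wondering how to configure the header method, here is a minimal sketch. It assumes Apache with mod_headers enabled and a host that allows header directives in .htaccess (AllowOverride FileInfo); neither is confirmed anywhere in this thread. Because FilesMatch matches on the filename, a single block in the root htaccess file should cover the _sm.jpg thumbnails in every directory, so no per-folder htaccess files are needed.

```apache
# Root .htaccess: applies to matching files in every subdirectory.
# Assumption: mod_headers is loaded; the IfModule wrapper avoids a 500 error if it is not.
<IfModule mod_headers.c>
    # Match any file whose name ends in _sm.jpg
    <FilesMatch "_sm\.jpg$">
        # Ask compliant crawlers not to index these thumbnails
        Header set X-Robots-Tag "noindex"
    </FilesMatch>
</IfModule>
```

A crawler has to fetch the file to see this header, so it probably should not be combined with a robots.txt disallow for the same files. As with robots.txt, only agents that honor the header will act on it.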
 
