Forum Moderators: goodroi

robots.txt to block bots access to images

got to save bandwidth

   
11:27 am on Feb 14, 2003 (gmt 0)

10+ Year Member



Hi

I haven't used a robots.txt file before but I am now beginning to see the light.

My website is chewing up rather a lot of bandwidth, mainly due to the large number of images, many of which are quite large in file size.

What I need is a simple robots.txt file which prevents ALL bots from accessing my image directories. Would the following be sufficient?

User-agent: *
Disallow: /images/
Disallow: /gfx/

User-agent: Googlebot-Image
Disallow: /

If so, great, but would it affect the way Google perceives the website, i.e. would the robots.txt file disadvantage the site in any way? For example, the site may be seen as smaller in terms of webspace, and Google likes big websites... (I'm probably being paranoid here.)

thanks.

1:58 pm on Feb 14, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Stavs,

The robots.txt code you posted will work - but only for robots which obey robots.txt.
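For anyone who wants to check rules like these before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. It only shows how a parser that follows the robots.txt convention reads the file; a badly behaved bot will ignore the file entirely. The helper name `allowed` and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the first post, pasted in verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /gfx/

User-agent: Googlebot-Image
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if a compliant bot with this user-agent may fetch path."""
    return parser.can_fetch(agent, path)


# Hypothetical paths, used only to exercise the rules.
for agent, path in [
    ("Googlebot", "/index.html"),
    ("Googlebot", "/images/photo.jpg"),
    ("Googlebot-Image", "/index.html"),
    ("SomeOtherBot", "/gfx/logo.gif"),
]:
    print(agent, path, allowed(agent, path))
```

Note that a bot with its own record, here Googlebot-Image, follows that record rather than the `*` record, so it is kept out of the whole site and not just the two image directories.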

If your problem is "brand-name" robots, like Googlebot, then your approach will work fine. If you are also being hit by many other 'bots with no name or unfamiliar names, then you may need to use a stronger method, such as access blocks by IP address or user-agent. How you do that depends on what server your sites are hosted on.

To save bandwidth, I blocked the majority of images on my sites too, and Google didn't seem to care; I didn't notice any "penalty" for doing so, if that's what you mean. I had a lot of image-based traffic that was not very useful to me or to the visitors - most would look and leave.

Jim

2:47 pm on Feb 14, 2003 (gmt 0)

10+ Year Member



Thanks JD - you've given me a few things to consider.

I suppose my next question is, given that some bots don't recognise (or take notice of) the robots.txt file, what would I achieve from using the file? Would it actually make any significant difference to my bandwidth usage?

3:00 pm on Feb 14, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



stavs,

Only a thorough analysis of the log files from your site(s) can answer the question of whether this will help or not.
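As a starting point for that kind of analysis, here is a rough Python sketch that totals the bytes served for image requests per user-agent. It assumes the server writes the common "combined" access log format; the sample lines, the extension list and the function name `image_bytes_by_agent` are invented for illustration and are not taken from any real log.

```python
import re
from collections import Counter

# Assumes the "combined" log format: host, identity, user, [time],
# "request", status, size, "referer", "user-agent".
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed list of image extensions; adjust to suit the site.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png")


def image_bytes_by_agent(lines):
    """Sum response sizes of image requests, grouped by user-agent string."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0].lower()
        if not path.endswith(IMAGE_EXTENSIONS):
            continue
        size = match.group("size")
        totals[match.group("agent")] += 0 if size == "-" else int(size)
    return totals


# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.10 - - [14/Feb/2003:11:27:00 +0000] "GET /images/big.jpg HTTP/1.0" 200 250000 "-" "Googlebot-Image/1.0"',
    '192.0.2.10 - - [14/Feb/2003:11:27:05 +0000] "GET /gfx/logo.gif HTTP/1.0" 200 4000 "-" "Googlebot-Image/1.0"',
    '192.0.2.20 - - [14/Feb/2003:11:28:00 +0000] "GET /index.html HTTP/1.0" 200 9000 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.20 - - [14/Feb/2003:11:28:01 +0000] "GET /images/big.jpg HTTP/1.0" 304 - "http://www.example.com/index.html" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

for agent, total in image_bytes_by_agent(SAMPLE_LINES).most_common():
    print(total, agent)
```

Run against a real log (for example by passing an open file object instead of SAMPLE_LINES), the agents at the top of the list show whether well-behaved spiders or unknown 'bots account for most of the image bandwidth.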

As I stated, I use robots.txt to block robots which will obey it, and stronger methods for those which won't. A search for "hot-linking", "image blocking" and similar subjects here on WebmasterWorld will turn up quite a few threads which may be useful to you.

On some sites, the impact of SE spiders requesting images is small in relation to overall traffic. On others, it may represent a significant load. Only the webmaster can determine the value of efforts to block image access by spiders.

On my sites where the load is significant, I use mod_rewrite in .htaccess to block robots which do not obey robots.txt, as well as blocking off-site referrers (hot-linking).
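The actual rules belong in the server configuration, but the decision they encode can be sketched in Python to make the logic clearer. This is not the mod_rewrite code itself and not necessarily how Jim's rules are written: the host names, the blocked user-agent substrings and the function name `should_block` are all hypothetical, and allowing an empty referrer is an assumption (many browsers and proxies do not send one).

```python
from urllib.parse import urlparse

# Hypothetical values; substitute your own hosts and the agents seen in your logs.
OWN_HOSTS = {"example.com", "www.example.com"}
BLOCKED_AGENT_SUBSTRINGS = ("badbot", "imagegrabber")
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png")


def should_block(path, user_agent, referer):
    """Decide whether a request should be refused.

    referer is the raw Referer header value, or "" if the header is absent.
    """
    # 1. Refuse known bad robots outright, whatever they ask for.
    agent = user_agent.lower()
    if any(name in agent for name in BLOCKED_AGENT_SUBSTRINGS):
        return True

    # 2. Refuse image requests whose referrer is another site (hot-linking).
    is_image = path.split("?", 1)[0].lower().endswith(IMAGE_EXTENSIONS)
    if is_image and referer:
        host = urlparse(referer).hostname  # lower-cased, or None if unparsable
        if host is None or host not in OWN_HOSTS:
            return True

    return False


print(should_block("/images/a.jpg", "Mozilla/4.0", "http://other.example.net/page.html"))
print(should_block("/images/a.jpg", "Mozilla/4.0", "http://www.example.com/page.html"))
```

Unlike robots.txt, a check of this kind is enforced by the server, so it also applies to clients that never read robots.txt.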

HTH,
Jim
