

Sitemaps, Meta Data, and robots.txt Forum

Controlling Crawler Access - different options

 11:28 am on Jul 29, 2010 (gmt 0)

I know the theory behind the different methods of controlling how a crawler accesses a site, and I just wanted to sanity check my approach. We run an e-commerce site with pages generated dynamically by our cart software. Some pages, like categories, have options that let you refine a list of items by price or manufacturer, for example. These use the query string and essentially produce a page with no unique content and no improvement over the un-refined page as far as the search engines are concerned. With so many dynamic pages, I think it is important that we guide crawlers by pointing them to important pages and away from unimportant ones. I see a few 'tools' available to me for tackling this:

1) Use a canonical link on the refined pages pointing to the un-refined version
2) Use a noindex meta tag to inform crawlers to ignore the refined page
3) Block in robots.txt to tell the crawlers not to crawl the refined page in the first place
4) Add a nofollow attribute to all links pointing to these refined pages
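
As a rough illustration of options 1 and 3, here is a minimal Python sketch. It assumes the refinements arrive as query-string parameters named `price` and `manufacturer`; those names, the example URLs, and the helper functions are hypothetical, since the thread does not say what the cart software actually uses. The robots.txt rules also rely on `*` wildcard matching, which the major search engines support but not every crawler does.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical refinement parameters; the real names depend on the cart software.
REFINEMENT_PARAMS = {"price", "manufacturer"}


def canonical_url(url: str) -> str:
    """Return the URL with refinement parameters removed (the un-refined page)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in REFINEMENT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Option 1: a canonical link element pointing at the un-refined version."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


def robots_txt_rules() -> str:
    """Option 3: robots.txt rules that ask crawlers to skip refined URLs.

    Assumes the crawler understands '*' wildcards in Disallow paths.
    """
    lines = ["User-agent: *"]
    for name in sorted(REFINEMENT_PARAMS):
        lines.append(f"Disallow: /*?{name}=")  # parameter is first in the query
        lines.append(f"Disallow: /*&{name}=")  # parameter follows another one
    return "\n".join(lines) + "\n"
```

Calling `canonical_link_tag("http://www.example.com/widgets?price=10-20&page=2")` keeps `page=2` and drops the price refinement. Note that a URL blocked in robots.txt is never fetched, so a crawler cannot see a canonical link or a noindex tag placed on that page.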

What I am trying to achieve is helping the SEs crawl our site efficiently. Presently I use a combination of 2 & 4. My thinking is to instruct the SEs not to follow links to these refined pages from within our navigation, and if they access one from elsewhere (i.e. linked from somewhere other than our navigation), they see the noindex meta tag.
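
The combination of 2 & 4 described above could be sketched in a page template helper along these lines. This is only an illustration: the parameter names `price` and `manufacturer`, the function names, and the example URLs are assumptions, not anything taken from the poster's cart software.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical refinement parameters; the real names depend on the cart software.
REFINEMENT_PARAMS = {"price", "manufacturer"}


def is_refined(url: str) -> bool:
    """True when the URL carries at least one refinement parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in query for name in REFINEMENT_PARAMS)


def nav_link(url: str, text: str) -> str:
    """Option 4: add rel="nofollow" to navigation links that lead to refined pages."""
    rel = ' rel="nofollow"' if is_refined(url) else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


def robots_meta_tag(url: str) -> str:
    """Option 2: emit a noindex meta tag in the head of refined pages only."""
    return '<meta name="robots" content="noindex">' if is_refined(url) else ""
```

With these helpers, `nav_link("/widgets?price=10-20", "Under $20")` produces a nofollow link, while `nav_link("/widgets", "Widgets")` is left as an ordinary link, and `robots_meta_tag` returns an empty string for the un-refined page so it stays indexable.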

Is this what you would do?



 6:45 pm on Aug 7, 2010 (gmt 0)

it's what i would do. there are some different approaches you could take, but they all have some downsides.


 9:57 pm on Aug 7, 2010 (gmt 0)

agreed - i would do what you are doing.


 9:20 am on Aug 9, 2010 (gmt 0)

Thanks for the replies - always nice to sanity check these things :)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved