Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt exclusion for dynamic page
rogerd




msg:1525777
 8:42 pm on Jun 17, 2003 (gmt 0)

Some of my shopping cart pages are getting indexed despite my attempts to stop it. The pages take the form:
www.domain.com/store/cart.asp?product=1234

My robots.txt file contains the following:

User-agent: *
Disallow: /cgi-bin/
Disallow: /store/cart.asp

The file validates using Brett's checker. Do I need to change the syntax of the URL to make this work?

 

jdMorgan




msg:1525778
 9:22 pm on Jun 17, 2003 (gmt 0)

rogerd,

Your robots.txt should prevent spiders from fetching your shopping cart pages as-is. However, Google will list any page it finds a link to, even without crawling that page. So, the answer depends entirely on what you mean by "cart pages getting indexed". Are these pages listed with a title and description, or is it just the URL that is showing in the SERPs?
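As a rough way to check the "should prevent spiders from fetching" part, here is a minimal sketch using Python's standard urllib.robotparser against the rules quoted above. It only shows how a well-behaved parser reads the file, not what any particular search engine actually does, and the /store/index.asp URL is a made-up example for comparison.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in the original question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /store/cart.asp
"""


def can_fetch(url, agent="*"):
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# The cart URL from the question: Disallow is a prefix match, so the
# query string does not need to appear in robots.txt.
print(can_fetch("http://www.domain.com/store/cart.asp?product=1234"))

# Hypothetical non-cart page, assumed here only for comparison.
print(can_fetch("http://www.domain.com/store/index.asp"))
```

If the first call reports that the URL is blocked, the syntax is fine and the listing in the SERPs is coming from links to the page rather than from a crawl of it.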

To prevent the "list just the URL" scenario, the solution is counter-intuitive: You must allow Google to fetch the page, and then use the <meta name="robots" content="noindex"> tag on each page. On a large site with dynamic URLs, this might be most easily done with a "light cloak" that redirects search engines to a "noindex" page. Since there is no attempt to mislead a searcher, there should be no risk of penalty.
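To illustrate the meta tag approach, here is a minimal sketch of the decision a page template could make: emit the noindex tag for cart URLs and nothing for everything else. The function name robots_meta_for and the idea of calling it from a template are assumptions for illustration only; the site in question runs ASP, so this Python version just shows the logic. For it to have any effect, the Disallow: /store/cart.asp line would have to come out of robots.txt so that spiders can actually fetch the page and see the tag.

```python
# The tag from the post, and the cart path from the original question.
NOINDEX_TAG = '<meta name="robots" content="noindex">'
CART_PREFIX = "/store/cart.asp"


def robots_meta_for(path_and_query):
    """Return the noindex tag for cart URLs, or an empty string otherwise.

    Hypothetical helper: a page template would place the returned
    string inside the <head> of the page being served.
    """
    if path_and_query.startswith(CART_PREFIX):
        return NOINDEX_TAG
    return ""


# Cart page: gets the noindex tag.
print(robots_meta_for("/store/cart.asp?product=1234"))

# Hypothetical non-cart page: gets no robots meta tag.
print(repr(robots_meta_for("/store/index.asp")))
```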

The "list just the URL" problem exists with Ask Jeeves/Teoma, as well as Google.

HTH,
Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved