Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

block all except in robots.txt
visualscope
msg:3350805
8:40 pm on May 26, 2007 (gmt 0)

hi all,
i have a site with more webpages (duplicate content issues) to block than to allow.
is there a way in robots.txt to achieve this?

I do know how to block pages from being crawled, but since I have more pages to block than to allow, I was thinking it would probably be easier to do the opposite.

thanks in advance

 

goodroi
msg:3351732
2:26 pm on May 28, 2007 (gmt 0)

the easiest way would be to put all of your blocked pages into one directory and your good pages into another. then you need only one line in your robots.txt file to block the entire bad directory.
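a minimal sketch of that layout (the directory names /private/ and /public/ are placeholders, not from the thread):

```
# block crawlers from the directory holding the duplicate pages;
# everything outside it stays crawlable by default
User-agent: *
Disallow: /private/
```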

if you do want to list individual pages in your robots.txt file, be careful that the file doesn't get too big. i once had a client whose robots.txt ran to several hundred KB and the spiders had a hard time reading it. avoid extreme sizes and you'll be ok.
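for the literal "block all except" the original poster asked about, major crawlers such as Googlebot also honor an Allow directive, so you can disallow everything and then re-open the good directory (again, /public/ is a placeholder name):

```
# disallow the whole site, then re-allow one directory;
# Allow is a widely supported extension, not part of the original robots.txt standard
User-agent: *
Disallow: /
Allow: /public/
```

crawlers that don't understand Allow will treat this as a full block, so test it against the bots you care about before relying on it.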


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved