
Sitemaps, Meta Data, and robots.txt Forum

    
Syntax for disallowing folder with '.' in the name?
Asking for clarification
ALbino




msg:3681887
 9:36 pm on Jun 23, 2008 (gmt 0)

Hey there,

I know virtually nothing about robots.txt so forgive me if this question is silly. I figured better safe than sorry. Anyway...

For whatever reason this site has its directory structure set up like this:

http://www.example.com/product.php/0001
http://www.example.com/product.php/0002
http://www.example.com/product.php/0003
etc.

And:

http://www.example.com/company.php/1234
http://www.example.com/company.php/1235
http://www.example.com/company.php/1236
etc.

They want to block only the /company.php/* URLs for fear of overlapping duplicate content (many of the companies have only one product, so the company and product pages are virtually identical).

I was just wondering what the correct Disallow syntax is for that. Thanks!

 

g1smd




msg:3681907
 9:57 pm on Jun 23, 2008 (gmt 0)

User-agent: *
Disallow: /compan

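Disallow matches on the start of the URL path, so put as much or as little of the /company.php/.... part into the Disallow statement as you like, as long as it is enough to be unique on the site (that is, it matches no other URLs that you still want crawled).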
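
If you want to sanity-check a rule before deploying it, here is a minimal sketch using Python's standard-library urllib.robotparser. It only approximates how a crawler does prefix matching, and real search engines may differ in details. The helper name blocked, the longer rule /company.php/, and the URL /companies.html are my own assumptions for illustration; they are not from the thread.

```python
from urllib.robotparser import RobotFileParser


def blocked(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `url` is disallowed for `agent` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, url)


# The rule as posted above: a short prefix.
SHORT_RULE = "User-agent: *\nDisallow: /compan"
# A longer, more specific prefix (assumed alternative, not from the thread).
FULL_RULE = "User-agent: *\nDisallow: /company.php/"

company = "http://www.example.com/company.php/1234"
product = "http://www.example.com/product.php/0001"
# Hypothetical URL, used only to show how a short prefix can over-match.
other = "http://www.example.com/companies.html"

for url in (company, product, other):
    print(url, "short:", blocked(SHORT_RULE, url), "full:", blocked(FULL_RULE, url))
```

Both rules should block the /company.php/ URLs and leave /product.php/ alone. The '.' in the path needs no special treatment because it is matched literally. The short prefix would also catch something like /companies.html if such a page existed, which is why the prefix needs to be unique on the site.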

ALbino




msg:3681936
 10:40 pm on Jun 23, 2008 (gmt 0)

Great, thanks g1smd!


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld ® and PubCon ® are Registered Trademarks of Pubcon Inc.
© Pubcon Inc. 1996-2012 all rights reserved