
Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt Clarification
smartcard

10+ Year Member



 
Msg#: 3497411 posted 5:41 am on Nov 6, 2007 (gmt 0)

I know that I can Disallow a file, but can I also Disallow URLs for that file that carry extra query-string values, such as:

www.mydomain.com/index.php?cat_id=234

Can I add the following line to my robots.txt file to keep URLs of this type from being indexed by search engines?

Disallow: /index.php?cat_id=

 

phranque

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 3497411 posted 6:30 am on Nov 6, 2007 (gmt 0)

Disallow is supposed to work for files and directories, and you may specify a partial name: the value is matched as a prefix of the URL path.
However, there is no mention of support for query strings in the protocol, so I wouldn't count on anything there...
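
To make the "partial name" point concrete, here is a minimal Python sketch of prefix matching. It assumes the crawler compares the Disallow value against the path plus the query string, which, as noted above, the protocol does not promise. The helper name `is_disallowed` is made up for illustration and is not part of any crawler or library.

```python
from urllib.parse import urlsplit

def is_disallowed(url: str, disallow_prefixes: list[str]) -> bool:
    """Return True if the URL's path (plus query string) starts with any Disallow value."""
    parts = urlsplit(url)
    # Assumption: the crawler compares path + "?" + query, not the path alone.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # An empty Disallow value blocks nothing, so skip it.
    return any(p and target.startswith(p) for p in disallow_prefixes)

rules = ["/index.php?cat_id="]
print(is_disallowed("http://www.mydomain.com/index.php?cat_id=234", rules))         # True
print(is_disallowed("http://www.mydomain.com/index.php", rules))                    # False
print(is_disallowed("http://www.mydomain.com/index.php?page=2&cat_id=234", rules))  # False
```

The last call shows one limit of a pure prefix rule: if cat_id is not the first parameter, the URL no longer starts with the Disallow value and slips through.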

Achernar

5+ Year Member



 
Msg#: 3497411 posted 1:25 pm on Nov 6, 2007 (gmt 0)

As far as I know, and from what I've seen of bots' behavior, as long as you don't place a wildcard in the middle of the argument string, even msnbot will understand your syntax.

smartcard

10+ Year Member



 
Msg#: 3497411 posted 1:59 pm on Nov 6, 2007 (gmt 0)

I have done the following:

Disallow: /index.php?

Any suggestions and recommendations welcome.

LordLink

5+ Year Member



 
Msg#: 3497411 posted 10:44 am on Nov 13, 2007 (gmt 0)

Using "?" after index.php tells search engines to ignore every index.php URL that carries arguments.

If you want to disallow only the URLs that contain cat_id, you can try using wildcards, for example:

Disallow: /*cat_id=*

[edited by: LordLink at 10:45 am (utc) on Nov. 13, 2007]
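
As a rough illustration of how a wildcard rule like the one above might be evaluated, here is a Python sketch. It assumes the crawler treats "*" as "any run of characters" and a trailing "$" as an end anchor. These are extensions that some major crawlers honor, not part of the original protocol. The function names are hypothetical.

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow value with '*' and an optional trailing '$' into a regex."""
    anchored_end = rule.endswith("$")
    if anchored_end:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with "any run of characters".
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))

def is_disallowed_wildcard(path_and_query: str, rule: str) -> bool:
    # re.match anchors at the start of the string, mirroring prefix matching.
    return rule_to_regex(rule).match(path_and_query) is not None

rule = "/*cat_id=*"
print(is_disallowed_wildcard("/index.php?cat_id=234", rule))         # True
print(is_disallowed_wildcard("/index.php?page=2&cat_id=234", rule))  # True
print(is_disallowed_wildcard("/index.php?page=2", rule))             # False
```

Under these assumptions the trailing "*" is redundant, since matching is already prefix-based: "/*cat_id=" gives the same results.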

