
Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt Clarification
smartcard




msg:3497413
 5:41 am on Nov 6, 2007 (gmt 0)

I know that I can Disallow a file, but can I also Disallow URLs for that file that carry extra query-string values, such as:

www.mydomain.com/index.php?cat_id=234

Can I add the following line to my robots.txt file to keep URLs of this type from being indexed by search engines?

Disallow: /index.php?cat_id=

 

phranque




msg:3497431
 6:30 am on Nov 6, 2007 (gmt 0)

Disallow is supposed to work for files and directories, and you may specify a partial name.
However, there is no mention of support for query strings in the protocol, so I wouldn't count on anything there...
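
To make the "partial name" behavior concrete, here is a minimal Python sketch of plain prefix matching. It assumes a crawler compares each Disallow value against the path plus the query string, which is the very point in doubt above, so treat it as an illustration of that assumption and not as a description of any particular bot. The helper names request_target and is_disallowed are hypothetical, and the http:// scheme is added to the example URL only so that it parses.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the path plus query string of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return f"{path}?{parts.query}" if parts.query else path


def is_disallowed(url, disallow_rules):
    """Plain prefix matching: blocked if the target starts with any rule.

    An empty Disallow value blocks nothing, so empty rules are skipped.
    Assumption: the query string takes part in the comparison.
    """
    target = request_target(url)
    return any(rule and target.startswith(rule) for rule in disallow_rules)


rules = ["/index.php?cat_id="]
for url in (
    "http://www.mydomain.com/index.php?cat_id=234",
    "http://www.mydomain.com/index.php",
    "http://www.mydomain.com/index.php?page=2&cat_id=234",
):
    print(url, "->", "blocked" if is_disallowed(url, rules) else "allowed")
```

Under that assumption only the first URL is blocked. The third one is not, because cat_id is no longer the first parameter and the rule is matched strictly from the start of the URL.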

Achernar




msg:3497687
 1:25 pm on Nov 6, 2007 (gmt 0)

As far as I know, and from what I've seen of bots' behavior, if you don't place a wildcard in the middle of the argument string, even msnbot will understand your syntax.

smartcard




msg:3497703
 1:59 pm on Nov 6, 2007 (gmt 0)

I have done the following:

Disallow: /index.php?

Any suggestions and recommendations welcome.

LordLink




msg:3503568
 10:44 am on Nov 13, 2007 (gmt 0)

Using ? after index.php tells search engines to ignore every index.php URL that has arguments.

You can try using wildcards if you want to disallow only the URLs that contain cat_id, for example:

Disallow: /*cat_id=*

[edited by: LordLink at 10:45 am (utc) on Nov. 13, 2007]
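
As a rough illustration of how a wildcard rule like the one above could be evaluated, the Python sketch below translates a rule into a regular expression. Wildcards are an extension that some crawlers honor, not part of the original protocol, so this shows the commonly documented semantics (* matches any run of characters, a trailing $ pins the end of the URL) and does not claim to reproduce any specific bot. The function names rule_to_regex and matches are hypothetical.

```python
import re


def rule_to_regex(rule):
    """Translate a wildcard rule into a regular expression (sketch).

    '*' becomes '.*'; a trailing '$' anchors the end of the URL;
    every other character is matched literally.
    """
    anchored_end = rule.endswith("$")
    if anchored_end:
        rule = rule[:-1]
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def matches(rule, target):
    # re.match anchors at the start of the string, mirroring prefix matching.
    return rule_to_regex(rule).match(target) is not None


for target in (
    "/index.php?cat_id=234",
    "/index.php?page=2&cat_id=234",
    "/index.php?page=2",
):
    print(target, "->", "blocked" if matches("/*cat_id=*", target) else "allowed")
```

With these semantics the rule blocks any URL containing cat_id= regardless of the file name or parameter order, which is broader than Disallow: /index.php?cat_id= and narrower than Disallow: /index.php?. The trailing * is redundant, since matching already works as a prefix match.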


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved