
Sitemaps, Meta Data, and robots.txt Forum

Robots.txt Clarification

 5:41 am on Nov 6, 2007 (gmt 0)

I know that I can Disallow a file, but can I also Disallow URLs for that file that come with extra query-string values, such as


Can I add the following line to my robots.txt file to keep URLs of this type from being indexed by search engines?

Disallow: /index.php?cat_id=



 6:30 am on Nov 6, 2007 (gmt 0)

Disallow is supposed to work for files and directories, and you may specify a partial name.
However, there is no mention of support for query strings in the protocol, so I wouldn't count on anything there...
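
To illustrate the "partial name" behavior, here is a minimal sketch of plain prefix matching. It assumes the crawler compares each Disallow value against the URL path with the query string still attached, which the original protocol does not promise. The function name blocked_by_prefix is made up for this example.

```python
def blocked_by_prefix(url_path: str, disallow_rules: list[str]) -> bool:
    """Return True if url_path starts with any non-empty Disallow value.

    Assumption: the crawler matches against path + query string as one string.
    An empty Disallow value means "allow everything", so it never blocks.
    """
    return any(bool(rule) and url_path.startswith(rule) for rule in disallow_rules)


rules = ["/index.php?cat_id="]

print(blocked_by_prefix("/index.php?cat_id=5", rules))  # True
print(blocked_by_prefix("/index.php", rules))           # False
print(blocked_by_prefix("/index.php?page=2", rules))    # False
```

Under that assumption the rule blocks only URLs that begin with /index.php?cat_id= and leaves the bare /index.php crawlable. A bot that strips the query string before matching would behave differently.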


 1:25 pm on Nov 6, 2007 (gmt 0)

As far as I know, and by what I've seen of bots' behavior, if you don't place a wildcard in the middle of the argument string, even msnbot will understand your syntax.


 1:59 pm on Nov 6, 2007 (gmt 0)

I have done the following:

Disallow: /index.php?

Any suggestions and recommendations welcome.


 10:44 am on Nov 13, 2007 (gmt 0)

Using "?" after index.php tells search engines to ignore every index.php URL that has arguments.

You can try using wildcards if you want to disallow only the URLs containing cat_id, for example:

Disallow: /*cat_id=*
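
As a rough comparison of the rules in this thread, the sketch below translates a Disallow value into a regular expression, treating "*" as "any run of characters" and a trailing "$" as "end of URL". This is only an approximation of how wildcard-aware crawlers are commonly described as behaving. Bots that do not support wildcards will read the "*" literally, and the helper names here are hypothetical.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow value into a compiled regex (illustrative only)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal pieces and join them with ".*" where a "*" appeared.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def blocked_by_wildcard(url_path: str, rule: str) -> bool:
    # re.match anchors at the start, so the rule still acts as a prefix.
    return rule_to_regex(rule).match(url_path) is not None


for rule in ("/index.php?", "/index.php?cat_id=", "/*cat_id=*"):
    for url in ("/index.php", "/index.php?page=2",
                "/index.php?cat_id=5", "/shop/list.php?cat_id=3"):
        print(rule, url, blocked_by_wildcard(url, rule))
```

With this interpretation, /index.php? blocks every index.php URL that has arguments, /index.php?cat_id= blocks only those whose first argument is cat_id, and /*cat_id=* blocks any URL on the site containing cat_id=. The trailing "*" in the last rule adds nothing, since matching is already prefix-based.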

[edited by: LordLink at 10:45 am (utc) on Nov. 13, 2007]

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved