
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Possible to block all urls with ? in them?

WebmasterWorld Senior Member 5+ Year Member

Msg#: 3800439 posted 1:40 am on Dec 5, 2008 (gmt 0)

I am going to convert a shopping cart written in PHP to static URLs using mod_rewrite, and was cautioned that I need to block all the old PHP URLs to avoid duplicate content issues in Google.
So does this work? If I add this into my robots.txt:
User-agent: *
Disallow: /*?

Will it block the search spiders from indexing all the old urls
that currently look like this:
[mydomain....] com/index.php?l=product_list&c=19

Does anyone have experience with this?
Basically I am trying to avoid listing all these old URLs individually in my robots.txt file. With 3000+ of them, that would make it awfully large.



WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

Msg#: 3800439 posted 1:28 pm on Dec 5, 2008 (gmt 0)

You are talking about using wildcards aka pattern matching. Google, Yahoo and MSN support this extra function. I have used it and it works just fine with the big three search engines.

Please remember that it is not officially part of the robots.txt protocol so the smaller bots will probably not follow the wildcard rules.
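To see what a wildcard-aware bot would do with a rule like `Disallow: /*?`, here is a rough Python sketch of the matching. It assumes the commonly described behaviour where `*` stands for any run of characters and a trailing `$` anchors the end of the URL. The function names `rule_to_regex` and `is_blocked` are made up for illustration, and the sketch ignores `Allow` lines and rule precedence, so treat it as a way to sanity-check a pattern and not as how any particular search engine is implemented.

```python
import re

def rule_to_regex(rule):
    """Turn a robots.txt path rule using '*' and a trailing '$' into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with "any characters".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path, disallow_rules):
    """True if any non-empty Disallow rule matches from the start of the path."""
    # An empty Disallow value means "nothing is disallowed", so skip it.
    return any(rule_to_regex(r).match(path) for r in disallow_rules if r)

rules = ["/*?"]
print(is_blocked("/index.php?l=product_list&c=19", rules))  # old dynamic URL
print(is_blocked("/products/19.html", rules))               # hypothetical static URL
```

With the single rule `/*?`, any path that contains a "?" is matched and a path with no query string is left alone. The static path `/products/19.html` is only a placeholder, since the thread does not say what the new URLs look like.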

Sidenote: if you are using mod_rewrite then you might not even need the robots.txt wildcards. If you 301 redirect all requests for URLs with "?" to their static versions, you wouldn't need this rule. Having the rule wouldn't hurt either.
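The redirect idea boils down to one decision per request: does the URL carry a query string, and if so, which static URL should it 301 to? A minimal Python sketch of that decision is below. On the real site this would be done in mod_rewrite rules, and the `/category/19.html` scheme and the `static_target` name are assumptions made up for the example, since the thread does not say what the new static URLs will be.

```python
from urllib.parse import urlsplit, parse_qs

def static_target(url):
    """Return the static path an old dynamic URL should 301 to, or None."""
    parts = urlsplit(url)
    if not parts.query:
        return None  # no "?" part, so nothing to redirect
    params = parse_qs(parts.query)
    # Hypothetical mapping: ?l=product_list&c=19 -> /category/19.html
    if params.get("l") == ["product_list"] and "c" in params:
        return "/category/%s.html" % params["c"][0]
    return None  # unknown query pattern, leave it alone

print(static_target("/index.php?l=product_list&c=19"))
print(static_target("/category/19.html"))
```

Every old URL pattern needs its own mapping like the `product_list` one here, so it is worth listing the query patterns the cart actually generates before writing the rewrite rules.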


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved