Possible to block all URLs with ? in them?



1:40 am on Dec 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

I am going to convert a shopping cart written in PHP to static URLs using mod_rewrite, and I was cautioned that I need to block all the old PHP URLs to avoid duplicate content issues in Google.
So does this work? If I add this into my robots.txt:
User-agent: *
Disallow: /*?

Will it block the search spiders from indexing all the old URLs that currently look like this:
mydomain.com/index.php?l=product_list&c=19

Does anyone have experience with this?
Basically, I am trying to avoid listing all these old URLs individually in my robots.txt file; with 3,000+ of them, that would make the file awfully large.


1:28 pm on Dec 5, 2008 (gmt 0)

WebmasterWorld Administrator goodroi

You are talking about using wildcards, a.k.a. pattern matching. Google, Yahoo and MSN support this extension to robots.txt. I have used it, and it works just fine with the big three search engines.

Please remember that it is not officially part of the robots.txt protocol, so smaller bots will probably not follow the wildcard rules.
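For illustration, a minimal robots.txt along those lines might look like this. The comments and the commented-out $ example are additions here, not part of the original question; the major engines' pattern matching supports both * and a $ end-anchor:

User-agent: *
# block every URL that contains a "?" (wildcard extension)
Disallow: /*?
# the big engines also support a "$" end-anchor; a line like the
# following would block only URLs that end in .php (example only)
# Disallow: /*.php$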

Sidenote: if you are using mod_rewrite, then you might not even need robots.txt wildcards. By 301 redirecting all requests for URLs with "?" to their static versions, you wouldn't need this rule. Having the rule wouldn't hurt either.
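For example, a minimal .htaccess sketch of that approach might look like the following. The static URL scheme (/product-list/19/) is hypothetical, modeled on the URL in the question, not the poster's actual rewrite scheme:

RewriteEngine On

# 301 the old dynamic URL to a hypothetical static version.
# Matching THE_REQUEST (the raw client request line) rather than
# QUERY_STRING keeps the internal rewrite below from re-triggering
# this redirect and looping.
RewriteCond %{THE_REQUEST} \?l=product_list&c=([0-9]+)\s
RewriteRule ^index\.php$ /product-list/%1/? [R=301,L]

# Internally map the static URL back to the real PHP script.
# The trailing "?" in the rule above strips the old query string.
RewriteRule ^product-list/([0-9]+)/$ index.php?l=product_list&c=$1 [L]

With that in place, a request for index.php?l=product_list&c=19 gets a 301 to /product-list/19/, which Apache then quietly serves from the same PHP script.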

