I'm handling an ecommerce website, and while checking the Index Status tab in WMT I see more pages under "Not indexed" than under "Indexed". Reading up on this, I learned that Google is not indexing some pages because some URLs redirect, some pages are duplicates, and so on.
I found the redirecting URLs and removed them, but I also want to block the duplicate pages via robots.txt. I don't understand how to write the pattern, because the URLs carry session IDs and other parameters. For example:
http://www.example.com/widget-red?ordernumber=12
http://www.example.com/widget-9922?pagenumber=3
So how do I tell Googlebot not to index these pages? Should I add the lines below to block them:
Disallow: /?ordernumber=
Disallow: /?pagenumber=
or just this:
Disallow: /*?
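To sanity-check which of those patterns actually matches my URLs, I threw together a small Python script based on my reading of Google's documented wildcard rules (* matches any run of characters, $ anchors the end, and a plain rule is a prefix match). It's only my rough understanding of the matching logic, not the real crawler:

```python
import re

def rule_matches(disallow_pattern: str, url_path: str) -> bool:
    """Rough sketch of Google's documented robots.txt matching:
    '*' matches any run of characters, '$' anchors the end, and a
    plain rule is a prefix match. Not the real Googlebot matcher."""
    regex = ""
    for ch in disallow_pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "$":
            regex += "$"
        else:
            regex += re.escape(ch)
    # re.match anchors at the start, so a rule without a trailing
    # '$' behaves as a prefix match.
    return re.match(regex, url_path) is not None

# The duplicate URLs from above (path + query part only):
urls = ["/widget-red?ordernumber=12", "/widget-9922?pagenumber=3"]

for rule in ["/?ordernumber=", "/?pagenumber=", "/*?"]:
    print(rule, [rule_matches(rule, u) for u in urls])
```

If that logic is right, Disallow: /?ordernumber= only matches URLs that literally start with /?ordernumber=, so neither of my product URLs would be blocked by it, while Disallow: /*? matches both.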
Also, when people search for products on my website, the site search produces URLs like this:
http://www.example.com/search?categories=0&q=widget+red
When I checked with the "site:" operator whether such URLs had been indexed, I found this one in Google:
http://www.example.com/search?q=
So how do I block those pages? Is this the correct way:
Disallow: /search?q=
or this:
Disallow: /*search?q=
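I ran the same sanity check for the search URLs, again using my own rough sketch of the wildcard rules (prefix match unless anchored with $, * matches anything), so treat the results with the same caution:

```python
import re

def rule_matches(disallow_pattern: str, url_path: str) -> bool:
    """Same rough sketch of Google's robots.txt matching as before:
    '*' -> any run of characters, '$' -> end anchor, otherwise a
    prefix match. Not the real Googlebot matcher."""
    regex = "".join(
        ".*" if ch == "*" else "$" if ch == "$" else re.escape(ch)
        for ch in disallow_pattern
    )
    return re.match(regex, url_path) is not None

for rule in ["/search?q=", "/*search?q="]:
    print(rule, rule_matches(rule, "/search?q="))
    print(rule, rule_matches(rule, "/search?categories=0&q=widget+red"))
```

By that logic, since plain rules are already prefix matches, /search?q= and /*search?q= both block /search?q= itself, but neither matches /search?categories=0&q=widget+red, because the q= parameter comes after categories= there.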
Sorry for the long post, but I'd appreciate an answer, because lately I've been seeing many duplicate pages indexed in Google.
[edited by: goodroi at 2:58 pm (utc) on Oct 3, 2012]
[edit reason] Examplified [/edit]