Hi,
I'm handling an ecommerce website, and while checking the Index Status tab in WMT (Webmaster Tools) I noticed that the pages under "Not Index" outnumber the pages under "Index". While reading up on the cause, I learned that Google is not indexing them because some URLs redirect, some pages are duplicates, and so on.
I found the redirecting URLs and removed them. I also want to block the duplicate pages via robots.txt, but I don't understand how to write the pattern because the URLs contain session IDs and similar parameters. For example:

http://www.example.com/widget-red?ordernumber=12
http://www.example.com/widget-9922?pagenumber=3

So how do I tell Googlebot not to index these pages? Should I add the lines below to block them:

Disallow: /?ordernumber=
Disallow: /?pagenumber=

or this:

Disallow: /*?

Also, when people search for products using my site search, the following URL comes up:
http://www.example.com/search?categories=0&q=widget+red

When I used the "site:" operator to check whether this URL had been indexed, I found this URL on Google:

http://www.example.com/search?q=

So how do I block the above pages? Is the way below correct:

Disallow: /search?q=

or this:

Disallow: /*search?q=

Sorry for the long post, but I'd appreciate it if you could answer my query, because lately I have seen many duplicate pages being indexed on Google.
Thanks.
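
P.S. To make the question concrete, here is a minimal Python sketch for checking the candidate patterns against the example URLs. It assumes Google-style matching: a rule is compared from the start of the path plus query string, `*` matches any run of characters, and a trailing `$` anchors the end. The helper names `rule_to_regex` and `is_blocked` are made up for illustration, and this is not Googlebot's actual code.

```python
import re


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a Disallow value into a regex (assumed Google-style wildcards)."""
    anchored = rule.endswith("$")          # trailing '$' means "must end here"
    body = rule[:-1] if anchored else rule
    # Escape everything literally, then let each '*' match any run of characters.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url: str, rule: str) -> bool:
    """Return True if the rule matches the URL's path + query from the start."""
    without_scheme = url.split("://", 1)[-1]
    # Drop the host, keep "/path?query" exactly as written (even an empty value).
    target = "/" + without_scheme.partition("/")[2]
    return rule_to_regex(rule).match(target) is not None


if __name__ == "__main__":
    urls = [
        "http://www.example.com/widget-red?ordernumber=12",
        "http://www.example.com/widget-9922?pagenumber=3",
        "http://www.example.com/search?categories=0&q=widget+red",
        "http://www.example.com/search?q=",
    ]
    rules = ["/?ordernumber=", "/*?", "/*?ordernumber=", "/search?q=", "/*search?q=", "/search?"]
    for rule in rules:
        for url in urls:
            print(f"{rule:18} {url:60} blocked={is_blocked(url, rule)}")
```

Under these assumptions, `/?ordernumber=` does not match the widget URLs because it only covers paths that begin with `/?`, while `/*?` matches every URL that contains a query string. Likewise `/search?q=` and `/*search?q=` both miss the `categories=0&q=` form, whereas `/search?` covers both search URLs.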
[edited by: goodroi at 2:58 pm (utc) on Oct 3, 2012]
[edit reason] Examplified [/edit]