Forum Moderators: Robert Charlton & goodroi

Webmaster Tools 'Web Crawl' and bogus query string links

         

barcodeuk

12:56 pm on Aug 18, 2009 (gmt 0)

10+ Year Member



Hi,

Our website runs on osCommerce, and I've noticed that Google has indexed pages on it with different query strings:

www.example.com/manufacturername.html?sort=2a
www.example.com/manufacturername.html?page=3
www.example.com/manufacturername.html?sort=3a&page=1

All of the above URLs have the same content as
www.example.com/page.html, and we don't have any links to them on our website or in our sitemap.

manufacturername.html is an SEO URL, so the page name changes for different manufacturers.

How can I remove them from Google's index, or redirect the above URLs to the original page? Or can I block Google from indexing the same page with different query strings?
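
One way to picture the redirect option is to normalise each URL by dropping the query parameters that do not change the content. The Python sketch below is only an illustration: the function name `canonical_url` and the set `DUPLICATE_PARAMS` are hypothetical, and it assumes that `sort` and `page` never change the content on the manufacturer pages. That assumption would not hold for pages where page= is real, so the set of parameters to drop is passed in rather than hard-wired.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only produce duplicates on manufacturer pages.
DUPLICATE_PARAMS = frozenset({"sort", "page"})


def canonical_url(url, drop=DUPLICATE_PARAMS):
    """Return url with the query parameters named in `drop` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    # urlunsplit omits the '?' entirely when the remaining query is empty.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

For example, `canonical_url("http://www.example.com/manufacturername.html?sort=3a&page=1")` gives back the URL without any query string (the http:// scheme is added here only so the URL parses cleanly), which is the address a 301 redirect or a canonical link could point to. Calling it with `drop={"sort"}` leaves page= untouched.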

I've used
Disallow: *sort=

in robots.txt. Will it stop Google from crawling pages with the sort query string?
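
To see which of the example URLs a wildcard rule like this would cover, here is a rough Python sketch of Google-style pattern matching, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names `robots_pattern_to_regex` and `is_disallowed` are made up for illustration; this is a simplified model of the matching, not Google's actual implementation, and Python's own `urllib.robotparser` does not handle wildcards.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern with '*' and a trailing '$' to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query, pattern):
    """True if the pattern matches from the start of the path plus query string."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


if __name__ == "__main__":
    rule = "*sort="
    for url in (
        "/manufacturername.html?sort=2a",
        "/manufacturername.html?page=3",
        "/manufacturername.html?sort=3a&page=1",
    ):
        print(url, is_disallowed(url, rule))
```

Under this model the rule matches the two URLs that contain sort=, but not the one that has only page=3. It would also match any other parameter whose name ends in "sort", such as a hypothetical resort=.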

What can I use to block page= on the above URLs?

We have real pages with ?page= on our website, so I don't want to use Disallow: *page=

I've only noticed the above query strings. Could there be more query strings that Google has crawled, or could crawl in the future?

Please advise.

Many Thanks

[edited by: Robert_Charlton at 6:04 pm (utc) on Aug. 18, 2009]
[edit reason] changed to example.com - it can never be owned [/edit]

tedster

7:16 pm on Aug 18, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You've got it, I'd say.

That one Disallow rule will stop indexing of both the duplicate URL types you mentioned. If they are already in the index and don't seem to go away, then you may need to do a removal request.

barcodeuk

7:41 am on Aug 19, 2009 (gmt 0)

10+ Year Member



Hi Tedster,

Thank you very much for your reply.

Kind Regards
SD