
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
How to noindex sort pages - page.php?sortby
sauce




msg:4357321
 7:43 pm on Aug 31, 2011 (gmt 0)

Trying to recover an ecommerce site from Panda and I've found tons of dupes in G from sort pages... each sort had 6 functions so basically I have 6 dupe pages of each category/product.

For the sort pages:

http://www.example.com/categorypage.php?sortby=price&asc

I've added:

<link rel="canonical" href="http://www.example.com/categorypage.php" />

to the headers but I'm wondering if I should add:

<meta name="ROBOTS" content="NOINDEX,FOLLOW">

to the sort page headers also... Will this also deindex http://www.example.com/categorypage.php ?


Also, is there any robots.txt regex or something I can use to block everything after the ?

ie: Disallow: /*.php?*

Panda Sux!

Thx

[edited by: tedster at 8:01 pm (utc) on Aug 31, 2011]
[edit reason] switch to example.com [/edit]

 

tedster




msg:4357450
 3:09 am on Sep 1, 2011 (gmt 0)

Disallow: /*.php?sortby

That's all you need to stop crawling the sortby URLs. Rules in robots.txt are naturally treated as the start of a pattern, so the final asterisk is not needed. And if you never need any query-string URL of any kind indexed, then Disallow: /*.php? would do the job.
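To see why the trailing asterisk is redundant, it helps to spell out the matching rule tedster describes: a Disallow pattern matches any path it prefixes, with * as a wildcard (and $ as an optional end anchor in Google's extension). A minimal sketch of that matching logic in Python (my own helper, not an official implementation):

```python
import re

def rule_matches(rule, path):
    """Google-style robots.txt matching (sketch): '*' matches any run of
    characters, a trailing '$' anchors the end, and otherwise the rule
    matches any path that starts with it."""
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, path) is not None

rule_matches("/*.php?sortby", "/categorypage.php?sortby=price&asc")  # True
rule_matches("/*.php?sortby", "/categorypage.php")                   # False
```

Because matching is prefix-based, "/*.php?sortby" already covers every sortby variant; adding "*" at the end changes nothing.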

However, the noindex robots meta is also a good idea, since Google sometimes does index a URL even though they haven't crawled it.
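On the deindexing worry in the original question: the meta tag only affects URLs whose response actually contains it, and since the bare category page and the sort pages are served by the same script, the tag has to be emitted conditionally. A rough sketch of the decision (the helper name is mine, not from the thread):

```python
from urllib.parse import urlparse, parse_qs

def robots_meta_for(url):
    """Emit a noindex,follow robots meta tag only for sort variants.
    The bare category URL returns no tag, so it stays indexable."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    if "sortby" in query:
        return '<meta name="robots" content="noindex,follow">'
    return ""
```

So the answer to "will this also deindex categorypage.php?" is no, as long as the tag is only printed when the sortby parameter is present.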

Another step you could take is to use the feature in WebmasterTools where you tell Google which parameters to ignore.

doc_z




msg:4357512
 7:29 am on Sep 1, 2011 (gmt 0)

I had the same problem and I'm using the canonical tag. It took some time until Google fixed it, but for me it seems to be the best way.

You can also use "noindex,follow", but don't use "noindex,follow" and the canonical tag at the same time.

I wouldn't use robots.txt to block URLs because it's a waste of link power.

tangor




msg:4357518
 7:43 am on Sep 1, 2011 (gmt 0)

I wouldn't use robots.txt to block URLs because it's a waste of link power.

I question that statement since robots.txt has NOTHING to do with links!

But I'm willing to be educated with exemplars which indicate that robots.txt is injurious to link juice (power, et al.)

This is one of those don't pee on my leg and tell me it's raining kind of things.

tedster




msg:4357646
 3:55 pm on Sep 1, 2011 (gmt 0)

I wouldn't use robots.txt to block URLs because it's a waste of link power.

For me, it depends on how much crawling there is for those parameters. I'd rather that bots didn't even request those URLs, but I suppose at a very low level might be OK.

schuon




msg:4357673
 5:10 pm on Sep 1, 2011 (gmt 0)

I wouldn't use robots.txt to block URLs because it's a waste of link power.

Well, if you block a page that gets external link power via robots.txt, that link power is lost. If you do a noindex,follow instead, it can be passed on. In this case, though, I'd assume you don't have that many external links on sort-by "price".

I used a canonical before to tell Google it's all the same page, and once the duplicate variants got removed from the index, I blocked it with robots.txt. From my experience, stuff that was indexed and then immediately blocked via robots.txt tends to stay around in the index, somewhere deep down...

doc_z




msg:4357872
 6:23 am on Sep 2, 2011 (gmt 0)

I question that statement since robots.txt has NOTHING to do with links!


Of course it has to do with links, because it creates dead ends in the linking scheme and prevents link power from flowing around (in contrast to "noindex,follow").

jerednel




msg:4358003
 3:51 pm on Sep 2, 2011 (gmt 0)

I've found Google's upgraded parameter handling tool in WMT works pretty well for this type of thing. Although URLs still linger.

netmeg




msg:4358022
 4:49 pm on Sep 2, 2011 (gmt 0)

Eh, I pretty much block everything with a question mark in the URL in robots.txt, and the sites are doing quite well. FWIW.
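For reference, netmeg's block-everything-with-a-question-mark approach would look something like this in robots.txt (a sketch; adjust the user-agent line as needed):

```
User-agent: *
Disallow: /*?
```

Since rules are prefix patterns with * as a wildcard, this single rule matches any URL whose path contains a question mark.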

