Forum Moderators: Robert Charlton & goodroi

"Noindex, nofollow" revisit

doughayman

3:50 pm on Dec 2, 2009 (gmt 0)

10+ Year Member



Hi all,

I've pored through the old threads on this and did not find an answer to my question, so I am a bit confused. I just reviewed Google's SEO Starter Guide, and this made me even more confused. Anyway, here goes:

Let's assume that I have 2 files in a particular folder of my website: let's call them Index1.htm and Index2.htm.

Index1.htm is the root page for a website, and its "robots" meta tag is defined as "index, follow". This file IS found in my website sitemap.

Index2.htm is an old copy of Index1.htm that has been hanging around, and its "robots" meta tag is defined as "noindex, nofollow". This file is NOT found in my website sitemap.

Now, suppose that somehow Google spiders Index2.htm. It should not be indexed at all (by virtue of its local "noindex" tag), but also, should not follow any of its subordinate links (by virtue of its local "nofollow" tag).
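To make the setup concrete, here is a minimal sketch of how the two "robots" meta tags could be read from each page's markup. It uses Python's standard html.parser module; the HTML strings and the names RobotsMetaParser and robots_directives are made up for illustration, and this only shows what a parser sees, not how Google itself processes the tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical markup for the two pages described above (an assumption).
INDEX1_HTML = '<html><head><meta name="robots" content="index, follow"></head></html>'
INDEX2_HTML = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

index1 = robots_directives(INDEX1_HTML)
index2 = robots_directives(INDEX2_HTML)
print(sorted(index1))
print(sorted(index2))
```

Each page carries its own directives, so the "noindex, nofollow" on Index2.htm is read independently of whatever Index1.htm declares.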

Assuming that Index1.htm and Index2.htm have identical subordinate page links, could the spidering of Index2.htm result in any of the following:

1) Removal of any link juice to subordinate pages that Index1.htm would provide?

2) Removal of any subordinate pages from Google's index?

Obviously, I can remove the extra page, Index2.htm, from the folder. My questions above assume that the file has remained in the folder.

Thank you in advance. I hope the above is clear.

helpnow

4:35 pm on Dec 2, 2009 (gmt 0)

10+ Year Member



In my experience, if a page is noindex, Google will delete it from the SERPs once it is crawled. Noindex has been working for me 100% of the time, and I have been using it a lot in the past 3 months. It is slow, but perfect. Slow meaning it has sometimes taken longer than 2 months to get URLs out of the SERPs using noindex. But they will come out of the SERPs even if you don't do anything else.

You can expedite this process by using the URL Removal Request tool. I love the URL Removal Request - I can have URLs out of the SERPs in 2-4 hours, and any associated problems thus fixed in 0-5 days.

helpnow

4:41 pm on Dec 2, 2009 (gmt 0)

10+ Year Member



P.S. Use rel=canonical on index2.htm to point to index1.htm. That will pass any juice over to index1. In my experience, and I am less sure of this as I am still testing, rel=canonical and a noindex can coexist. The bot has to see the noindex to know to take the page out of the SERPs, and it will also see the rel=canonical. Just because there is a noindex on a page does not mean Googlebot will not crawl it. Semantics here, but from what I have seen it will still crawl the page and just not put it into the SERPs, and thus a rel=canonical will serve as an on-page 301 redirect. Note that using the URL Removal Request may delete the URL before Googlebot has a chance to see the rel=canonical, so you may delay the transfer of juice if you do that. URL Removal Request is great if you cannot wait and need to get the damn URL out of the SERPs now, at any cost.
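As a rough illustration of the point that both signals sit on the same page and are both visible to whatever fetches it, here is a small sketch using Python's standard html.parser. The class name HeadSignalsParser, the example.com URL, and the markup are assumptions made up for this example; it says nothing about how Google weighs a noindex against a rel=canonical.

```python
from html.parser import HTMLParser


class HeadSignalsParser(HTMLParser):
    """Records robots directives and the rel=canonical target, if present."""

    def __init__(self):
        super().__init__()
        self.robots = set()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr_map = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.robots.add(part.strip().lower())
        elif tag == "link" and "canonical" in attr_map.get("rel", "").lower().split():
            self.canonical = attr_map.get("href") or None


# Hypothetical head section for index2.htm carrying both signals (an assumption).
INDEX2_HEAD = """
<head>
  <meta name="robots" content="noindex, nofollow">
  <link rel="canonical" href="http://www.example.com/index1.htm">
</head>
"""

signals = HeadSignalsParser()
signals.feed(INDEX2_HEAD)
print(sorted(signals.robots))
print(signals.canonical)
```

Both values can only be read if the page is actually fetched, which matches the caveat that removing the URL before it is crawled again may delay the canonical being seen.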

doughayman

4:59 pm on Dec 2, 2009 (gmt 0)

10+ Year Member



Thanks for this info, but I am still looking for answers to my 2 questions above, assuming that both files co-exist in the same folder.

creative craig

5:47 pm on Dec 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It should not detract from the power of the links being passed from Index1.htm, nor should it cause the removal of any pages from Google's index.

FranticFish

6:12 pm on Dec 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Could the spidering of Index2.htm result in any of the following:
1) Removal of any link juice to subordinate pages that Index1.htm would provide?
2) Removal of any subordinate pages from Google's index?

Given Google's less than perfect interpretation of dupe content and what to do with it, it is not possible to state with certainty that you wouldn't have problems IF the page were spidered.

But as others have said, whilst Google doesn't always seem to obey robots.txt I've never seen it ignore the metatag noindex command, so it's a moot point.

rainborick

6:15 pm on Dec 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



(1) The use of "noindex,nofollow" on index2.htm will have no effect on the link juice that is passed from index1.htm to your subordinate pages. The only effect is that index2.htm will be removed from the index and any link juice that index2.htm might have been passing will stop flowing.

(2) The removal of index2.htm by using "noindex,nofollow" will not automatically cause any subordinate pages to be removed from the index. However, if index2.htm were providing the only link to a subordinate page, that subordinate page would likely fall out of the index eventually. Given your scenario, this seems an extremely unlikely event.
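The two points above can be sketched with a toy link graph. This is only a simplified model under stated assumptions, not Google's actual algorithm: a page marked "nofollow" contributes no followed links, and a subordinate page with no followed inbound links is flagged as at risk. The subordinate page names (products.htm, contact.htm, old-promo.htm) are hypothetical, and old-promo.htm is added only to show the "only link" case, which does not arise in the original scenario where both pages have identical links.

```python
# Toy model: each page has robots directives and a list of outbound links.
# All subordinate page names here are hypothetical.
PAGES = {
    "Index1.htm": {
        "robots": {"index", "follow"},
        "links": ["products.htm", "contact.htm"],
    },
    "Index2.htm": {
        "robots": {"noindex", "nofollow"},
        "links": ["products.htm", "contact.htm", "old-promo.htm"],
    },
}


def followed_inbound_links(pages):
    """Map each link target to the source pages whose links are followed."""
    inbound = {}
    for source, info in pages.items():
        for target in info["links"]:
            # Register every target, even if no followed link reaches it.
            inbound.setdefault(target, [])
            if "nofollow" not in info["robots"]:
                inbound[target].append(source)
    return inbound


inbound = followed_inbound_links(PAGES)
orphaned = sorted(target for target, sources in inbound.items() if not sources)

for target, sources in sorted(inbound.items()):
    print(target, "<-", sources)
print("No followed inbound links:", orphaned)
```

In this model the pages linked from Index1.htm keep their followed link regardless of Index2.htm, and only a page linked solely from Index2.htm ends up without one.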

doughayman

7:14 pm on Dec 2, 2009 (gmt 0)

10+ Year Member



OK, sounds like a good consensus. Thanks to all who replied - much appreciated !