
Google SEO News and Discussion Forum

This 67 message thread spans 3 pages: < < 67 ( 1 [2] 3 > >     
Downside to noindex?
realmaverick

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4282288 posted 4:28 am on Mar 16, 2011 (gmt 0)

After revisiting the structure of my website for the first time in several years, I'm finding more and more absolutely useless stuff indexed. I want it deindexed so I can claim back the link juice and crawl allowance.

An example: a new user signs up, it appears in the timeline with their username hyperlinked, and Google follows this to their empty profile. That then links to several empty pages where their content will be, should they create any. Of course, many, many users don't.

I've now un-hyperlinked the username in this instance, removed the links in the profile if no content exists, and added noindex, follow to the subsequent pages.

I hate using these tags, just as I hated using nofollow way back when. But sometimes, it seems necessary.

Is there any downside to what I've done with the noindex, follow? Is Google likely to give a crap that I've just told it not to index half a million pages?
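
For reference, the tag described above is the standard robots meta element; a minimal sketch of what would go in each page's head:

```html
<!-- Keep this page out of the index, but let crawlers follow its links -->
<meta name="robots" content="noindex, follow">
```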

 

pageoneresults

WebmasterWorld Senior Member pageoneresults us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 8:13 pm on Mar 17, 2011 (gmt 0)

Hmmm, I did have the privilege of reviewing the document in question. There is a META Refresh in the process too. Google will typically treat those as 301s. So, there's a bit more to this particular instance than most others would be dealing with.

Disallow: > META Refresh > noindex, nofollow

A bit confusing, yes? Googlebot SHOULD have only gotten the Disallow, which is what happened. Now it shows your standard URI-only entry.

Simsi

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4282288 posted 8:16 pm on Mar 17, 2011 (gmt 0)

Although this particular example is an affiliate redirect script and there are no external links to it at all, I just sussed out that the title is made up of two parts: the anchor text of the affiliate link and the title of the referring page (in this case the site name). So that's logical - thanks.

So am I right in thinking: use robots.txt to disallow directories, and META noindex tags to make sure specific pages don't get indexed?
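
That division of labour can be sketched like this (hypothetical paths). One caveat worth noting: a URL disallowed in robots.txt is never fetched, so Googlebot will never see a meta noindex placed on such a page - use one mechanism or the other for a given URL:

```
# robots.txt - stop crawling of a whole directory
User-agent: *
Disallow: /affiliate-out/
```

```html
<!-- On a specific page you want crawled but kept out of the index -->
<meta name="robots" content="noindex">
```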

TheMadScientist

WebmasterWorld Senior Member themadscientist us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4282288 posted 8:27 pm on Mar 17, 2011 (gmt 0)

Fascinating... <snip>

setzer



 
Msg#: 4282288 posted 8:47 pm on Mar 17, 2011 (gmt 0)

Personally I feel noindex is a poor solution. If you have links that you don't want Google to index, why are they on your site? If it's a structural issue - fix it.

TheMadScientist

WebmasterWorld Senior Member themadscientist us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4282288 posted 8:59 pm on Mar 17, 2011 (gmt 0)

Did anyone who followed the link aakk9999 posted make it to this thread?

Why Google Might "Ignore" a robots.txt Disallow Rule [webmasterworld.com]

Definitely worth the read.

tedster

WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 9:10 pm on Mar 17, 2011 (gmt 0)

Good reminder. One big point - if your robots.txt has a User-agent: Googlebot section, then you need to include ALL the rules you expect googlebot to follow in that section. The rules in a User-agent: * section will not have any effect.

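
A sketch of that pitfall (hypothetical rules): if Googlebot finds a section naming it specifically, it ignores the User-agent: * section completely, so any general rules must be duplicated:

```
User-agent: *
Disallow: /cgi-bin/

# Googlebot obeys ONLY this section and skips the rules above,
# so anything from the * section must be repeated here
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /search/
```
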
g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 9:11 pm on Mar 17, 2011 (gmt 0)

I met another site with the exact same problem only last month.

The information in that thread is as true now as it was in 2006. :)

There are also definitive answers from both GoogleGuy and Vanessa Fox.

Those were the days!

TheMadScientist

WebmasterWorld Senior Member themadscientist us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4282288 posted 9:22 pm on Mar 17, 2011 (gmt 0)

There are also definitive answers from both GoogleGuy and Vanessa Fox.

Yeah, remember when they used to remember us over here and at least pretended like they cared?
You're right g1smd, those were the days...

Simsi

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4282288 posted 9:32 pm on Mar 17, 2011 (gmt 0)

Did anyone who followed the link aakk9999 posted make it to this thread?

Why Google Might "Ignore" a robots.txt Disallow Rule [webmasterworld.com]

Definitely worth the read.


Thanks TheMadScientist - all is now clear :-)

helpnow

5+ Year Member



 
Msg#: 4282288 posted 9:45 pm on Mar 17, 2011 (gmt 0)

@setzer

Here's one reason why. We are a manufacturer and an ecommerce site, and we sell thousands of products and have been doing so since the 90s. Some of those products exist in different sizes, have different reviews, links etc., and stand on their own as separate URLs. But it is silly to get Google to index all the variations... So we set one of them as a master and have it indexed, and set the variations to noindex. For legacy reasons the pros outweigh the cons, and noindex gives us a reasonable solution.

walkman



 
Msg#: 4282288 posted 10:00 pm on Mar 17, 2011 (gmt 0)

Google (and Bing) themselves said to use "noindex" if you intend to add more to the page eventually - suggesting that noindex pages don't count in the SERPs - and to use a 404/410 if the page will not be coming back. Probably has to do with page age or something; if you delete it, you lose it. A 410 is a bit faster and it's gone for good, so there's no need for Googlebot to recheck it.

On my noindex pages all the internal navigation is removed
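
One way to serve the 410 walkman mentions, if the site runs Apache, is mod_alias (hypothetical path):

```
# .htaccess - tell crawlers this page is gone for good (HTTP 410)
Redirect gone /old-widget-page.html
```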

browsee



 
Msg#: 4282288 posted 1:11 am on Mar 18, 2011 (gmt 0)

@walkman, why did you remove internal navigation? It is noindexed, right? Does it matter?

tedster

WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 1:25 am on Mar 18, 2011 (gmt 0)

Noindex still means that the links on the page are followed and they will circulate PageRank (unless you use "noindex,nofollow"). For that reason, I usually don't change anything about internal navigation on a noindexed page.

So I'm curious about your thinking here, too, walkman.

browsee



 
Msg#: 4282288 posted 2:18 am on Mar 18, 2011 (gmt 0)

@tedster, I saw a success story this morning, they are using 'noindex, follow' instead of just 'noindex'. Which one do you prefer after Panda update?

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 2:30 am on Mar 18, 2011 (gmt 0)

"Follow" is the default action, so stating it or not stating it makes no difference.

walkman



 
Msg#: 4282288 posted 3:04 am on Mar 18, 2011 (gmt 0)

@walkman, why did you remove internal navigation? It is noindexed, right? Does it matter?


It drains PR, and my pages are not static, in the sense that I have to update them quite often. That takes time, and I can only dedicate that on a smaller site. So I'm not taking any chances; I removed maybe too much, but when I come back I can make it up with the rest. Once that happens I can add the pages back, one by one, after updating them of course. They are not deleted in my CMS, just 'up in the air' - out of navigation, search and everything - until I tick a box again.

tedster

WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 3:06 am on Mar 18, 2011 (gmt 0)

Thanks, I think I see your point of view, now. There are now no internal links pointing to those URLs either - right?

TheMadScientist

WebmasterWorld Senior Member themadscientist us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4282288 posted 5:17 am on Mar 18, 2011 (gmt 0)

On my noindex pages all the internal navigation is removed

Drains PR

The links to the pages drain PR equally from internal AND external links, but the links on them back to your indexed pages don't drain PR from anywhere, except the outbound links ... I really don't get it?

I can see not linking to them, but what you do by not linking back to yourself from them is increase the PR passed by every outbound link ... The outbound links pass more PR when you remove the internal links ... What am I missing?

walkman



 
Msg#: 4282288 posted 5:24 am on Mar 18, 2011 (gmt 0)

Madscientist,

Home > category > page
Home > alphabet > page

If I have 100 on the categories it is much worse than having 30. How much flows back, I don't know...

TheMadScientist

WebmasterWorld Senior Member themadscientist us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4282288 posted 5:34 am on Mar 18, 2011 (gmt 0)

You increase the external PR flowed by every link on your site, and decrease the internal PR you keep ... Example:

If page A has 100 links on it and 10 of those are external (90 internal links + 10 external links), when you remove 70 links to your site you are left with 20 internal and 10 external. You inadvertently cause the external links to flow more PR by removing the links to your own pages.

You don't increase the PR of the page those links are on. You change the amount 'awarded' (flowed through) to each link. Yes, the amount each internal link passes goes up, but so does the amount the external links pass. By having more internal links you 'dampen' the amount flowed through the external links. By removing internal links you 'dampen' the value you keep internally overall.

You started off with 90 overall links to your site on the page. You ended up with 20 overall links to your site on the page. There are a constant 10 external links. Who lost link weight by removing the links, you or the external pages?

10 points / 100 links = .1 passed by each, internally and externally.
10 points / 30 links = .33 passed by each, internally and externally.

90 x .1 = 9 points internal.
10 x .1 = 1 point external.

20 x .33 = 6.6 points internal.
10 x .33 = 3.3 points external.

You are taking link weight away from yourself and sending it to the competition.
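
That arithmetic can be checked with a short script - a simplified even-split model of PageRank flow, for illustration only, not Google's actual algorithm:

```python
def link_weights(points, internal, external):
    """Split a page's distributable link weight evenly across its links.

    Simplified model: every link on the page passes points / total_links,
    whether it points back into the site or out to another site.
    """
    per_link = points / (internal + external)
    return internal * per_link, external * per_link

# Before pruning: 90 internal + 10 external links share 10 points
before_internal, before_external = link_weights(10, 90, 10)

# After removing 70 internal links: 20 internal + 10 external remain
after_internal, after_external = link_weights(10, 20, 10)

print(before_internal, before_external)  # 9.0 internal, 1.0 external
print(after_internal, after_external)    # ~6.67 internal, ~3.33 external
```

In this model the 10 points leave the page regardless; removing internal links only changes who receives them, which is the point being made above.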

walkman



 
Msg#: 4282288 posted 5:47 am on Mar 18, 2011 (gmt 0)

I understand your point, and maybe as PR keeps going in circles I end up with the same amount. But cleaner pages are worth it.

I have a few external links on each page and they are all on the last page. It's hard to get direct links to my 'product pages' so everything flows from the homepage and 3 other sections.

TheMadScientist

WebmasterWorld Senior Member themadscientist us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4282288 posted 5:52 am on Mar 18, 2011 (gmt 0)

Got it, so it might be something that works for your specific situation, but I don't think I'd recommend it for everyone ... Just wanted to make sure I wasn't missing something silly, because you really made me think about it for a few minutes. ;)

Of course, now I am starting to wonder how many people get into a situation with a 'loss of rankings' or 'penalty' of some type and do something similar, which could (theoretically) cause them to lower their own rankings more and increase the rankings of others more?

I think it could be more than I would have guessed yesterday ... lol

tedster

WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 7:59 am on Mar 18, 2011 (gmt 0)

You are taking link weight away from yourself and sending it to the competition.

Only if you are linking TO your competition. And even then, this is only the PR calculation, just part of the total algorithm. Google's algo definitely does something else that helps sites that link out.

Two years ago I tested two sites, launched at the same time. I kept everything as parallel as possible, except that one never linked out and the other linked out at least twice from every article. The one with external links was soon ranking better. Both were direct sales sites (not affiliates).

What Google does in cases like this, I can only guess, so I won't. But it seems clear to me they do something, both from my own testing and from some cryptic comments made by Matt Cutts from time to time on this topic of external linking.

Future

5+ Year Member



 
Msg#: 4282288 posted 10:21 pm on Mar 23, 2011 (gmt 0)

In order to even see the noindex meta tag, googlebot must crawl the page. It may crawl less frequently after it verifies the noindex a few times, but it must continue to crawl.

This is accepted, but can a page get indexed in the SERPs with a noindex tag?
Regardless of any keywords it contains?

This is very important and meaningful for the existence of this tag.

Sgt_Kickaxe

WebmasterWorld Senior Member sgt_kickaxe us a WebmasterWorld Top Contributor of All Time



 
Msg#: 4282288 posted 10:38 pm on Mar 23, 2011 (gmt 0)

Another trick for sites with profile pages that are often unpopulated is to place the link behind some logged-in-only code. If the links only show up when logged in, they don't show up for the average user or for search engines. You can safely noindex pages that are for members only.

Future

5+ Year Member



 
Msg#: 4282288 posted 10:49 pm on Mar 23, 2011 (gmt 0)

If the links only show up when logged in they don't show up for the average user or search engine.

cloaking ?

pageoneresults

WebmasterWorld Senior Member pageoneresults us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 10:53 pm on Mar 23, 2011 (gmt 0)

This is accepted but can a page get indexed in SERPs with no-index tag?


I've never seen it happen and I've been using it for years now. When someone does claim that they've found noindex documents in the SERPs, there is ALWAYS a reason why. Most of the time it is because the document is disallowed via robots.txt so the noindex is not being seen. I've not seen any other instances that I can remember that weren't due to robots.txt.

noindex does just what it says on the tin: the document WILL NOT appear in the index. You can perform site: searches and you'll see that they will not get returned, no matter how advanced you get with the queries. If you do find noindex documents in the SERPs, then something is technically wrong.

It is one of the few protocols that all the SEs adhere to. It is referred to as the REP (Robots Exclusion Protocol) and there are only three choices...

noindex
nofollow
noindex, nofollow

Anything else is someone's misinterpretation of the protocol.

Personally I feel noindex is a poor solution. If you have links that you don't want Google to index, why are they on your site? If it's a structural issue - fix it.


Use of noindex is to prevent documents from appearing in the index. Use of nofollow prevents the links in that document from being followed. I use noindex to take the cruft out of the equation. For example, I don't want upper-level category pages indexed in most instances. I want them to get crawled, and the bot to follow the links, but I don't want the document indexed; there are more valuable documents further down the breadcrumb, which is what I want indexed - the money stuff. I like to conserve equity wherever I can. There is no need for the intermediary pages to be on the front line.

Future

5+ Year Member



 
Msg#: 4282288 posted 11:03 pm on Mar 23, 2011 (gmt 0)

If you do find noindex documents in the SERPs, then something is technically wrong.

Thank You for clarifying pageoneresults.

My last all-in-one question.
All noindex docs are only meant for self-non-disclosure, but can disclose anything they contain + pass on link-juice and any other META info.

If my understanding above is correct, noindex tag has a very great meaning..

pageoneresults

WebmasterWorld Senior Member pageoneresults us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 11:15 pm on Mar 23, 2011 (gmt 0)

All noindex docs are only meant for self-non-disclosure, but can disclose anything they contain + pass on link-juice and any other META info.


Not sure I fully understand the question.

noindex documents are just that: they are not indexed, and they are invisible to anyone searching for them - unlike robots.txt entries, which can easily be found via a site: search.

Google states that they will download and crawl the noindex document but it WILL NOT appear in their index. And yes, noindex documents pass value - you want them to. For example, if you are using noindex on intermediary catalog pages, you want the bot to crawl and give credit for everything that is there. From my perspective, it's like telling the bot, "Hey, you can crawl this to maintain the integrity of the linking architecture and the document semantics, but I don't want you to index it."

I think of it this way, if you have a site with 100k documents, there's a good chance that a large percentage of those are intermediary documents that may not be worth indexing. Especially if you are allocated a certain amount of crawl budget and pages indexed. You want to take whatever equity comes your way and direct it to the documents that deserve it the most.

The use of noindex, nofollow adds another level of protection to the document. My experience shows me that noindex, nofollow effectively blocks the page from obtaining and/or passing value; it is removed from the equation. We do this with login pages and other documents that have no value from an indexing perspective but are a must for the user.
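
Under that policy, a login page's head might carry (illustrative snippet):

```html
<!-- Login page: keep it out of the index and don't follow its links -->
<meta name="robots" content="noindex, nofollow">
```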

Future

5+ Year Member



 
Msg#: 4282288 posted 11:36 pm on Mar 23, 2011 (gmt 0)

Thank You again pageoneresults.
Now I strongly believe
noindex is only meant for NOT GETTING or NOT LETTING a document indexed in the SERPs.

noindex does add a protection level to the document, as stated in the example above, but it definitely differs from nofollow (not to be confused).

Apparently... I have observed a ranking site getting de-ranked in the SERPs due to high usage of the noindex tag on their non-required pages, except their main node/thread/content/article/topic pages.

Not available in SERPs now?

pageoneresults

WebmasterWorld Senior Member pageoneresults us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4282288 posted 12:08 am on Mar 24, 2011 (gmt 0)

I have observed a ranking site getting de-ranked in the SERPs due to high usage of the noindex tag on their non-required pages, except their main node/thread/content/article/topic pages.


If you're saying that the use of noindex caused that site to get deranked overall, that would be a concern. I've always been an avid fan of noindex for documents that I don't want users landing on after performing a search. Also, I don't want anyone scraping site: searches and finding my entire site indexed.

I've seen no ill effects of using noindex and/or noindex, nofollow where appropriate. When dealing with larger sites, there are quite a few intermediary documents that I feel don't need to be indexed. If a user landed on one of them, they'd have to click once or twice more to find what they were looking for. My thinking is to just remove that hodgepodge from the equation and provide the bot a direct indexing path to the primary content. All those directory style listings that are paginated get the noindex treatment. It's the click after that which counts. The final destination.

Not available in SERPs now?


It's possible that they overcooked things and removed "too much" from the equation, who knows...
