| This 58 message thread spans 2 pages: < < 58 ( 1  ) || |
|PR Sculpting Doesn't Work and Internal NoFollow Can Harm Your Site|
In Matt's latest video blog titled Should I use the nofollow attribute on internal links? [youtube.com] he takes 2 minutes to emphatically say "NO! DON'T USE REL=NOFOLLOW ON INTERNAL LINKS!"
I've never personally used it and have never recommended it for internal "page rank sculpting" but some people still promote this crazy trend regardless of the fact that Matt has pretty much emphatically said "don't do it" forever.
I knew someone that attempted to sculpt their internal links once and Google dropped some very odd pages from their site apparently from having the flow of page rank broken throughout the site. Once they removed the NOFOLLOW from the internal links the problem quickly resolved itself.
Again, internal page rank sculpting via REL=NOFOLLOW is a BAD IDEA and can do more harm than good.
What about subscriber-only daily listing pages?
e.g. Each day I list all Widget Events for each city.
Boston: Widget World, Widget Expo
Elmira: Widgets R Us
The widget_events.html page is world readable, but only subscribers can access the event pages. Thus to prevent SEs indexing or knowing that the events pages exist I set:
<a rel="nofollow, noindex" href="boston/widget_world.html">
<a rel="nofollow, noindex" href="boston/widget_expo.html">
This way SEs don't get a boilerplate 'login to view' page recorded against each page. Is this now not a good thing to do?
Yes I could disallow widget_world.html in robots.txt but there could be thousands of combinations of cities and events!
And what is the difference between rel="nofollow" and a nofollow meta tag on each page? Is it now also a bad idea to use
<meta name="robots" content="noindex,nofollow">
I thought these were *good* practices even if some had tried to use it to sculpt PR. Surely it is still ok for non informational pages such as login pages, copyright notices, privacy info pages.
|And what is the difference between rel="nofollow" and a nofollow meta tag on each page? Is it now also a bad idea to use |
<meta name="robots" content="noindex,nofollow">
This is worth contrasting with...
<meta name="robots" content="noindex,follow">
...which I think is often the best way to keep a page out of the index while maintaining PageRank circulation within your site. It doesn't solve all problems, but it can minimize the rel="nofollow" black hole effect.
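To make the difference between the two tags concrete, here is a minimal Python sketch of how a crawler might read the robots meta content string. The function name `parse_robots_meta` is made up for illustration, and the sketch assumes only the documented defaults (a page is indexed and its links are followed unless the tag says otherwise). It says nothing about how Google handles PageRank internally.

```python
def parse_robots_meta(content: str) -> dict:
    """Interpret a robots meta content string such as "noindex,follow"."""
    directives = {token.strip().lower() for token in content.split(",")}
    return {
        # Pages are indexed by default unless "noindex" (or "none") is present.
        "index": not ({"noindex", "none"} & directives),
        # Links are followed by default unless "nofollow" (or "none") is present.
        "follow": not ({"nofollow", "none"} & directives),
    }


for tag in ("noindex,nofollow", "noindex,follow", "noindex"):
    print(tag, "->", parse_robots_meta(tag))
```

Note that a bare "noindex" comes out the same as "noindex,follow", since following links is the default.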
|I thought these were *good* practices even if some had tried to use it to sculpt PR. Surely it is still ok for non informational pages such as login pages, copyright notices, privacy info pages. |
I've followed this practice for those sorts of pages, but now am wondering. The frustrating thing is that I'll see a privacy page, for instance, outrank money pages and, worst of all, those with Pulitzer Prize worthy content.
|If we allow these listings to be indexed, then Google skips lots of our city pages, which are more important, and indexes these listings instead. The same situation is being discussed here [youtube.com...] |
Note that the video from Matt Cutts is discussing internal duplication, rather than content that is duplicated across many domains.
I still feel it's best to let Google crawl and index all those URLs. But I'd also make darn sure that the navigational structure gives very clear signals as to what is the CORE content and what is more ephemeral -- especially if the ephemeral content is going to be a large number of URLs compared to the core.
Google can usually figure out what content is not worth much of its energy, and what is more important and typical for the site's main purpose. If they start to get it wrong, that's when I'd begin thinking about further steps.
|If they start to get it wrong, that's when I'd begin thinking about further steps |
That is what makes me and other people do something like that. Google indexes those listings while leaving out other important pages.
Moreover, as Robert mentioned, if we put those listings in noindex then we can put their links in nofollow. Right?
If this is user generated content and it gets submitted at a rapid pace, then I think you should put nofollow on those links, whether you let those pages get indexed or not. I would do it only for the UGC links and not for the rest of the links on the page.
You're not doing it to sculpt PR - you're using the nofollow attribute for its original purpose and protecting yourself from links pointing to bad neighborhoods.
Yes, exactly. We do not have any control over this content, and it is copied on many sites at the same time. Moreover, these pages are so numerous that they exhaust all quota from gbot, and our other important pages are left behind. When putting them in noindex and their links in nofollow, many site owners were not even aware of PR sculpting. Their main concern is duplicate and junk content, and feeding Google the least important pages while it ignores more important ones.
|Moreover, as Robert mentioned, if we put those listings in noindex then we can put their links in nofollow. Right? |
curioustoddler - I wasn't suggesting noindex,follow for your situation... I was responding specifically to Frank_Rizzo's question.
With regard to your situation, when you say this...
|Moreover, these pages are so numerous that they exhaust all quota from gbot, and our other important pages are left behind. |
...it makes me think that this is a structural problem...
What you need to do is to make sure your pages get the most internal PageRank distribution and are crawled before the other listings are crawled. Most important here is your internal navigation. As tedster suggested earlier, you want your "core" city pages to be higher up in the nav structure than the ever-changing listings pages... and you may also want your own listings to be higher up in the nav structure than other agents' listings.
But if all listings are linked from the same page, then the rel="nofollow" on the links to the other agents would have the same effect as leaving those links as dofollows.
You need to set up the nav structure so that the other agents' listings aren't siphoning off PageRank (and distracting Googlebot) from the pages that you do want Google to spider, index, and rank.
That's not the same as keeping them out of the index or preventing them from getting spidered.
It is about creating a nav structure where Google sees your main pages and your listings on pages higher up in the nav structure, before the remaining PageRank is split among all the other listing pages.
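The point about links sharing one page can be put as simple arithmetic. This sketch assumes the model Matt Cutts described in 2009, in which nofollowed links still count toward the number of links a page's PageRank is divided among. The function name `link_shares` and all the numbers are purely illustrative.

```python
def link_shares(page_rank: float, followed: int, nofollowed: int):
    """Split a page's PageRank across its links.

    Assumption: every link on the page, nofollowed or not, counts in the
    divisor, so nofollowing some links does not raise the share of the rest.
    Returns (share per followed link, total passed on, total lost).
    """
    per_link = page_rank / (followed + nofollowed)
    return per_link, per_link * followed, per_link * nofollowed


# One listings page: 10 links to your own listings, 40 to other agents' listings.
everything_followed = link_shares(1.0, followed=50, nofollowed=0)
agents_nofollowed = link_shares(1.0, followed=10, nofollowed=40)

print("all links followed:", everything_followed)
print("agent links nofollowed:", agents_nofollowed)
```

Under that assumption each of your own listings receives the same share in both cases, which is why moving the other agents' listings to a page lower in the nav structure does more than nofollowing them in place.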
But if we keep those listings in noindex, how can it harm the site?
|But if we keep those listings in noindex, how can it harm the site? |
Let me turn the question around, since you understand your goals better than I do.... How do you think it can help?
I do not expect any help by getting these pages indexed. I am more concerned about any negative effect of these pages being in noindex and links to these pages in nofollow.
< moved from another location >
I have a few hundred older posts on my blog that give me no value and that I no longer want indexed or using up PageRank (if you believe that PR sculpting works).
Last week I started sculpting by making these pages noindex, nofollow and have quickly noticed a nice uptrend in my money keyword rankings each day as these pages are being deindexed.
However, today I started thinking after I followed a random link from a site back to one of these noindex, nofollow pages - am I creating a bunch of PageRank dead ends? Say these pages have a few links pointed to them here and there; I believe I am creating a dead end and losing any value from links on these pages to my primary target pages.
Would I be better off using noindex, follow, because the spider comes to the page and can at least get out to the pages I want to rank via internal linking? Or would I only be deindexing these pages while they still keep their PageRank?
I guess the best way to ask this question is: with noindex, follow, does PageRank simply flow through the noindexed page to my followed links without the page itself 'accumulating' any PR?
Edit I just found this older thread: [webmasterworld.com...]
It seems the general opinion is that noindex, follow pages do accumulate PageRank, so if I want to stop PR from flowing from my primary pages to these pages I need to use noindex, nofollow and live without the internal links from these pages that may be providing some internal linking boost.
I suppose the best way would be to get really granular and make sure internal links that point to these pages from my money pages are nofollowed, while internal links pointing out from these pages to the primary pages are dofollowed - ugh.
With blog templates the links are all fairly circular, so it would be tough not to miss a PageRank leak.
I hope the above makes sense. Thoughts? Let me have it..
[edited by: Robert_Charlton at 1:50 am (utc) on Jul 20, 2010]
[edit reason] moved post to this discussion [/edit]
Robert, thanks for moving this post to a relevant discussion. I did not spot it when I asked my question.
whitenight: You wisely speak of empirical evidence and taking what Matt Cutts says with a grain of salt. While he does give some great tips, I believe part of his job is to keep the SEO community off balance. Think about it: there is no way he would ever say "if done properly, pagerank sculpting will help the rankings of your most important pages and this is how to do it".
Anyway, I am only a sample of one and might not be doing my PageRank sculpting perfectly by putting nofollow, noindex on a bunch of pages that are not really helping me anymore, but as I mentioned, my key rankings have been in a steady uptrend as the pages are being deindexed. Coincidence, perhaps. You never really know unless it is repeated, because rankings change all the time and sometimes what you did last is not the cause of the effect.
What gave me the idea to cut back my site was that tiny Squidoo / Hubpages type sites were starting to move ahead of my much larger site, which has lots of content that is not as tightly focused as a "Lens".
|But if we keep those listings in noindex, how can it harm the site? |
|I do not expect any help by getting these pages indexed. I am more concerned about any negative effect of these pages being in noindex and links to these pages in nofollow. |
curioustoddler - For purposes of discussion, I've put your last two comments above together.
I've got to confess that I'm not quite following the syntax of your last comment... but my thoughts, which I hope address your concerns, are that putting the listings in "noindex" will not help your site... and I'm not sure why you'd want to keep them out of the index, even if they were duplicates of other agents' listings.
At worst, these listings simply would not rank for your domain... but the noindex robots tag would assure that. It's remotely possible that one of these listings might rank on your domain for a relevant search and might attract a click in the serps. This isn't what you're pushing, but I don't see how the traffic could hurt you. It might get someone browsing in your site. Why go to any trouble to prevent that unlikely but nevertheless desirable behavior?
I don't think the pages dilute "theme", eg, nor do I think they'll cause Google to somehow lower the quality profile of your site or whatever it is that concerns you.
As long as they're on your site, the pages are going to divert some PageRank. Making them a robots meta "noindex" will not improve that situation.
Re what I think is the crux of your concerns... I don't believe the meta robots "noindex" tag on the page will help more pages to be crawled, but at least it won't prevent PR to the pages from being recirculated in your site, so it's the least intrusive of approaches available. You also should not "nofollow" links from these pages.
My view is that if you want to regain the PR and crawl budget that these pages divert, then the only way you can completely do that is to delete the pages from your site and delete the links to the pages from your navigation.
If you keep the pages on your site, though, you can and should make them "less important"... ie, minimize the PageRank diversion to them by shunting them off to a side subcategory from a location page that's low in your top-down hierarchy.
While this thread has been going on, my indexed pages have gone up three times after putting those listings in noindex. And this is the third time I am seeing this happen.
curioustoddler - Your post appeared just as I was in the process of answering jdancing's question (my next post), which is addressing the "follow" vs "nofollow" issue rather than the "noindex" question.
For the moment, let me say that your results are very intriguing. I'd have to know more about the structure of your site before I could say much more. I'm curious how you were handling the dupe issue before now. From your first post on this thread, I assume that you were not using "nofollow", and that we're both in agreement on that issue.
I've been recommending the "noindex,follow" robots meta tag over the rel="nofollow" link attribute for some time. Offhand, I would still recommend trying a structural fix before I used "noindex". In terms of spidering resources, I'm assuming that Google is using roughly the same crawl effort to process a "noindex" meta tag as it is to spider the page, so I'm not understanding how "noindex" is making better use of your "crawl budget", if that's what's accounting for the difference you're seeing.
The core issue initially in this thread was "follow" vs "nofollow"... but you're adding a very interesting extra ingredient.
jdancing - Regarding your questions about "follow" vs "nofollow"....
|Would I be better off using noindex, follow, because the spider comes to the page and can at least get out to the pages I want to rank via internal linking? |
Yes, your original instincts about the "follow" attribute were correct.
|It seems the general opinion is that noindex, follow pages do accumulate PageRank, so if I want to stop PR from flowing from my primary pages to these pages I need to use noindex, nofollow and live without the internal links from these pages that may be providing some internal linking boost. |
I think you overthought this and got it backwards... and the understanding hinges on the meaning of the word "accumulate".
If you mean "accumulate" in the sense of "hold onto" or "hoard" or "drain" from the rest of the site, that's what "noindex,nofollow" does. I think you would be better off using "noindex,follow", per your original thoughts.
As I understand it, "nofollow", either as a rel="nofollow" link attribute or in the meta robots tag, is a PageRank black hole.
Pages with the meta robots "noindex" tag that are spidered by Google have their contents and related data collected but are not shown in the public index. These pages can "accumulate" PageRank, in the sense that Google is aware of the pages and can follow links to them.
It helps here to think of PageRank (and other associated linking factors) as a liquid... link juice... and to think of the link structure as plumbing. Link juice can flow to a "noindex" page.
Depending on the "follow" or "nofollow" attribute, a noindexed page can distribute PageRank to other pages which this page links to, or it can hold onto it. Note that "follow" is default behavior, which means that the link juice can flow from a "noindex" or "noindex,follow" page.
If you use "nofollow", though, essentially you've turned off the links leading out of the page with the meta robots "nofollow" attribute, and link juice doesn't flow through them.
Links with the rel="nofollow" also prevent link juice from flowing through the specific links the attribute is on.
Opening the outbound links on a page does not mean that you empty the page of PageRank. This is a common misconception which can lead to PageRank hoarding. In the PageRank formula, PageRank is viewed as constantly recirculating throughout the site and through the web. PageRank is a measure of popularity of a page... of all the combined link juice recommendations that flow through the web to this page. It's not a measure of how much PageRank is left on a page after you've linked out. Thus, the word "accumulate" is misleading, at least with regard to a single page.
What the meta robots "follow" attribute does is to allow the PageRank to continue circulating, so the page isn't walled off from the rest of the site.
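The plumbing picture can be tried out on a toy site. The sketch below is the simplified textbook PageRank iteration, not Google's actual algorithm, and the three page names are invented. It also assumes, following the "black hole" description above, that a page with no followed links simply passes nothing on.

```python
def toy_pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank over a dict of page -> list of followed link targets."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source, targets in links.items():
            if not targets:
                continue  # no followed links: this page's share goes nowhere
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# "old" stands for a noindexed archive page that home links to.
with_follow = toy_pagerank({
    "home": ["money", "old"],
    "money": ["home"],
    "old": ["home", "money"],  # noindex,follow: links still pass value
})
with_nofollow = toy_pagerank({
    "home": ["money", "old"],
    "money": ["home"],
    "old": [],                 # noindex,nofollow: outbound links switched off
})

for page in ("home", "money", "old"):
    print(page, round(with_follow[page], 3), round(with_nofollow[page], 3))
```

In this toy model the total rank stays constant when "old" keeps its followed links and shrinks when it does not, and the "money" page ends up lower in the nofollow case. That is the recirculation argument in miniature, under the stated assumptions.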
Robert thanks for the very clear explanation. I have switched these pages to noindex, follow.
Hopefully all this monkeying around doesn't get the google gods angry.
Thanks Robert. Actually I never thought of PageRank; my motive was to keep gbot away from those listing pages, and I did not even consider whether they should be nofollow or follow. I just used noindex. Now I realize that I accidentally did something right.
|my motive was to keep gbot away from those listing pages... I just used noindex... I did something right |
Just for clarity, you are NOT keeping gbot away. It's happily munching its way through the NOINDEX pages, but will keep them out of the SERPs.
To keep gbot off your pages, use robots.txt.
If you block pages in robots.txt, links will NOT be followed because they will not be found. If you NOINDEX a page, the links WILL be followed, and PR & PR-like data (relevance, semantics) will flow naturally.
The site I have been talking about got a minus fifty penalty. For every keyword, even the site name (without .com), the site is now ranking on the 5th to 7th page. During this month I have just added Google Maps to the site and this noindex. Can putting listings in noindex cause this minus fifty penalty?
What do I need to check and do now?
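For anyone who wants to check what a robots.txt rule actually blocks, Python's standard library has a parser. This is only a sketch: the `/events/` directory and the example.com URLs are hypothetical, and the parser shows crawl permission only, not what ends up in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: one directory-level rule instead of one line per page.
robots_txt = """\
User-agent: *
Disallow: /events/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for url in (
    "https://example.com/widget_events.html",
    "https://example.com/events/boston/widget_world.html",
):
    if parser.can_fetch("Googlebot", url):
        print(url, "-> may be crawled, so a robots meta tag on it can be read")
    else:
        print(url, "-> blocked, so its links and meta tags are never seen")
```

A single prefix rule like this covers any number of city and event combinations, provided the URLs share a directory.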
What about affiliate links with an internal transition page, i.e. "please wait while we connect you"?
We recently told G to nofollow these links because we spotted some in the index even though they were blocked in robots.txt.... the pages jumped from page 1 to page 3 on the next G crawl. Plus G now seems confused about which page to serve.
After 8 hours the site has come back at google.co.uk with all pages ranking as before. But at other Google sites it is still at -50.
Did you do anything? like revert back?
No, I did not do anything. I am still checking and trying to understand what went wrong. There are no paid links on the site. I never bought any links. And I doubt that putting those listings in noindex can cause this.
site is not appearing at 51 even for sitename( without .com)
|Can putting listings in noindex cause this minus fifty penalty? |
Not that I'm aware of.
|While this thread has been going on, my indexed pages have gone up three times after putting those listings in noindex. And this is the third time I am seeing this happen. |
Again, if this is cause and effect, this is interesting news. Apparently, though, the outcome of what you've done isn't quite yet clear.
I haven't felt that noindex would have anything to do with your crawl issues, and if it did expand your crawl budget, that would be new and useful information to me. I still think crawl budget is a structural issue, not to be controlled either by noindex or nofollow. I was strongly suggesting, though, that nofollow would most likely screw up the internal flow of your PageRank.
I'm wondering whether you by any chance have also used robots.txt to keep these listing pages out of the index. IMO, robots.txt can cause a lot of problems in the wrong hands, and it's definitely not my method of choice for sculpting PageRank.
Pages blocked by robots.txt can be a black hole for PageRank. As Shaddows points out, links from such blocked pages will not be followed because Google doesn't see them. Blocking a page by robots.txt also disables that page's robots meta tag, since Google never fetches the page to read it.
If you have used robots.txt to get rid of the listing pages, this may be a source of your problems. Otherwise, you might want to look at other issues on your site and to start another thread about those particular problems. I don't see how it could be a noindex problem.
PS: I see you have started a new thread about your -50 problems.... [webmasterworld.com...]
|We recently told G to nofollow these links because we spotted some in the index even though they were blocked in robots.txt |
Neither of the methods you used will keep the pages out of the index. Google already has the URLs, and the robots.txt block only keeps it from crawling them. So it takes any links without 'nofollow' that point to a page, plus whatever other information it can relate to it, to try to figure out what the page is about. It then often includes the page in the index, because it can't find out whether the page is important by visiting it and doesn't want to leave out a page its visitors expect to find...
There are ways to get the pages out and keep them out, and one of the simplest is to use the removal tool for the URLs now listed, then change the links to point to a single URL that redirects to the page based on the link clicked and have that URL blocked in the robots.txt and the links nofollowed like you do now... If anything shows in the index it will only be the URL of the redirect page and if it's dynamic, you can put a simple 'nothing to see here' page up if a link is not clicked to visit, so a direct visit yields nothing or you could even put a small 'you might be interested in' sitemap on it for direct visits.
I did not block any pages by robots.txt, and I have started a new thread.
Are you sure noindex can not cause a sitewide -50 penalty?