
Forum Moderators: Robert Charlton & goodroi

No index, no follow meta tags on blog archive pages?

     
8:15 pm on Jun 10, 2019 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Jan 8, 2019
posts: 93
votes: 2


Hi guys,

Thanks to everyone who has helped me since I joined WebmasterWorld! It's much appreciated!

So, when I spider my site using Screaming Frog, it only finds 22 blogs (we have 500+).

When I turn on the filter FOLLOW INTERNAL NO-FOLLOW then SF finds all my blogs.

Turns out all my blog archive pages (pg 1, pg 2, etc) have "no index, no follow" tags.

SF was only able to find blogs that are linked to from other pages besides my blog archive pages.

My question is, if SF wasn't spidering those 500+ blogs, does that mean Google isn't crawling them?

Also, will the No Follow tag stop PageRank from passing from homepage to blog archive page on to the blogs themselves?

Just wondering if the No Index, No Follow tag on my archive pages is a problem in any way.

PS Looks like all my blogs are indexed despite SF not being able to crawl them.
10:14 pm on June 10, 2019 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1137
votes: 140


NOINDEX is a PAGE TAG: if a page has a noindex meta-tag, Google and other bots that obey the tag - not all do - won't index it.

NOFOLLOW is both a PAGE TAG and a LINK ATTRIBUTE: if a page has the NOFOLLOW tag (<META NAME="ROBOTS" CONTENT="NOFOLLOW">), Google won't follow links on that page; if a link has the NOFOLLOW attribute (<a href="anypage.htm" rel="nofollow">), Google will not follow that link. However, Google may still follow other links to the target page that don't have the NOFOLLOW attribute, so NOFOLLOW doesn't guarantee that a page won't be crawled, and has no effect on whether it is indexed.

Use NOFOLLOW in the HEAD section of a page when you don't want Google (or other robots) to follow any of the links on that page. Use NOFOLLOW as a link attribute on any individual link you don't want Google to follow.

Use NOINDEX (<META NAME="ROBOTS" CONTENT="NOINDEX">) in the HEAD section of a page you don't want indexed.

Your decisions about which tags/attributes to use where on your own site should be informed by what you want bots to do (or not to do), not by any SEO objectives.

NOINDEX in the page HEAD will prevent that page from being indexed, so if you want that page indexed, don't use it.

Screaming Frog is only spidering internal links. If all internal links to a page have the nofollow attribute then it probably won't get crawled by Google unless there are third-party links to it.

Noindex will have no effect on links, as it isn't a valid link attribute. If you want pages indexed and crawled, ensure you have at least one internal link to them without the nofollow attribute, and do not use noindex on them.
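
To make the page-tag versus link-attribute distinction concrete, here is a minimal Python sketch that reads both from a page's HTML. It uses only the standard-library html.parser module; the class name RobotsAudit and the function audit are made up for illustration, and the sketch only reports what the markup says, not how any particular bot will behave.

```python
from html.parser import HTMLParser


class RobotsAudit(HTMLParser):
    """Collect page-level robots directives and per-link rel="nofollow" flags."""

    def __init__(self):
        super().__init__()
        self.page_directives = set()  # from <meta name="robots" content="...">
        self.links = []               # (href, is_nofollow) for each <a href=...>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            for token in attrs.get("content", "").split(","):
                if token.strip():
                    self.page_directives.add(token.strip().lower())
        elif tag == "a" and "href" in attrs:
            # rel is a space-separated token list; tolerate stray commas too.
            rel_tokens = attrs.get("rel", "").lower().replace(",", " ").split()
            self.links.append((attrs["href"], "nofollow" in rel_tokens))


def audit(html):
    """Return (page_directives, links) for one HTML document."""
    parser = RobotsAudit()
    parser.feed(html)
    return parser.page_directives, parser.links
```

Running audit() on an archive page would show at a glance whether "nofollow" sits in the page-level set (affecting every link on the page) or only on individual links.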

[edited by: Wilburforce at 10:31 pm (utc) on Jun 10, 2019]

10:30 pm on June 10, 2019 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4558
votes: 363


I have some difficulty following the question only because I think of a "blog" as being a website whose content is posts. I am pretty sure you are not asking about multiple websites (500+) on a website. Are "blogs" similar to individual posted entries for this question? If so, it seems normal that the site's archive folders are not usually indexed in addition to their other taxonomy.

You would not want to submit every instance of an article separately, or you will have multiple ways to reach the same content. It is normal for these various routes to the same article to exist, but you don't want to submit all copies for indexing, because that would be duplicate content. If the same content is found at example.com/blog/widgets/blue-widgets/ and also at example.com/blog/archives/blue-widgets/, you don't count that as two separate articles.

I can understand the no index tag but don't see a reason for nofollow.
Also, will the No Follow tag stop PageRank from passing from homepage to blog archive page on to the blogs themselves?
I have never seen any information about using "nofollow" for that purpose.

11:26 pm on June 10, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 889


Turns out all my blog archive pages (pg 1, pg 2, etc) have "no index, no follow" tags.
It sounds as if SF is being silly and should cut it out. The “nofollow” tag doesn’t mean “pretend you didn’t see this link”. It simply means “don’t tell them I sent you”.
8:13 am on June 11, 2019 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1137
votes: 140


It simply means “don’t tell them I sent you”


As far as Google is concerned, it means "don't follow this link" ([support.google.com]).
11:39 am on June 11, 2019 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Apr 25, 2019
posts:87
votes: 42


There is no reason to index archive pages, which can create duplicate content. The archive pages are set up for navigation purposes once a reader is on the site. E.g. no one is searching for page 100 of site "X" and there is no reason to have it indexed when all those articles are indexed individually.
2:29 pm on June 11, 2019 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Apr 30, 2008
posts:2630
votes: 191


Turns out all my blog archive pages (pg 1, pg 2, etc) have "no index, no follow" tags.

If your paginated archive pages are the only way to reach some articles (no other internal or external links point to a blog article), then Google will not "see" those articles through them (it will see the links, but it **may** ignore them). There will also be no link juice flowing to these article pages from the rest of your site.

I would remove "nofollow" from paginated pages (you can leave "noindex").
I would also check via site:example.com to see what pages Google has in its index.
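
As a rough illustration of why a crawler that honours page-level nofollow finds only the articles linked from elsewhere, here is a small Python sketch of a breadth-first crawl over an in-memory site. The URLs, the site dictionary, and the crawl function are all hypothetical; this is a toy model, not how Screaming Frog or Google actually work.

```python
from collections import deque


def crawl(start, pages, respect_nofollow=True):
    """Breadth-first crawl over an in-memory site.

    `pages` maps URL -> {"nofollow": bool, "links": [urls]}, where
    "nofollow" stands for a page-level robots nofollow directive.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        page = pages[url]
        if respect_nofollow and page["nofollow"]:
            continue  # page is visited, but none of its links are followed
        for target in page["links"]:
            if target in pages and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: archive pages carry page-level nofollow.
site = {
    "/": {"nofollow": False, "links": ["/blog/page/1", "/post-a"]},
    "/blog/page/1": {"nofollow": True,
                     "links": ["/post-a", "/post-b", "/blog/page/2"]},
    "/blog/page/2": {"nofollow": True, "links": ["/post-c"]},
    "/post-a": {"nofollow": False, "links": []},
    "/post-b": {"nofollow": False, "links": []},
    "/post-c": {"nofollow": False, "links": []},
}

found_strict = crawl("/", site)                          # honours nofollow
found_all = crawl("/", site, respect_nofollow=False)     # ignores nofollow
```

In this toy site, only /post-a is discovered in the strict crawl because the homepage links to it directly; /post-b and /post-c are reachable only through the nofollowed archive pages.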
3:11 pm on June 11, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Apr 1, 2016
posts:2738
votes: 837


Most of what has been stated above is spot on, but I would like to push back on this statement by @Wilburforce:
Noindex will have no effect on links, as it isn't a valid link attribute.

and by @aakk999:
I would remove "nofollow" from paginated pages (you can leave "noindex").

While it is technically true that noindex does not have a direct effect on links, there is an unfortunate side effect to using "noindex": if the page is not in the index, then its links are not in the index either. Thus, the links on the non-indexed page are not counted towards PageRank. This issue was pointed out a few months or maybe a year ago by John Mueller and it sparked quite the controversy. See the discussion here:
[webmasterworld.com...]
(a year ago nearly to the day... how time flies)

It is also worth noting, though it may not apply here, that the noindex and nofollow directives can be set in the server response headers using the "X-Robots-Tag" header. If you don't see the meta tags in the HTML markup of the page, the directive may be set in the .htaccess file or by other means in the response headers.
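
For anyone who wants to check the headers rather than the markup, here is a minimal Python sketch that pulls directives out of "X-Robots-Tag" response headers. The function name robots_directives_from_headers is made up, and the sketch assumes you already have the headers as (name, value) pairs from whatever HTTP client you use. It deliberately ignores the bot-specific form of the header (a value prefixed with a user-agent name), which would need extra handling.

```python
def robots_directives_from_headers(headers):
    """Return the set of directives found in any X-Robots-Tag header.

    `headers` is an iterable of (name, value) pairs; a list of pairs is
    used instead of a dict because the header may appear more than once.
    """
    directives = set()
    for name, value in headers:
        if name.lower() != "x-robots-tag":  # header names are case-insensitive
            continue
        for token in value.split(","):
            token = token.strip().lower()
            if token:
                directives.add(token)
    return directives
```

If this returns "noindex" or "nofollow" for an archive page whose HTML has no robots meta tag, the directive is coming from the server configuration rather than the page template.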
4:13 pm on June 11, 2019 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1137
votes: 140


@NickMNS

I agree, but I wasn't talking about links from a noindexed page. I meant <a href="anypage.htm" rel="noindex, nofollow"> isn't valid syntax, so noindex used in that way will have no effect.
6:35 pm on June 11, 2019 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Jan 8, 2019
posts: 93
votes: 2


@Everyone

Thank you so much for all the advice!

@NickMNS

If No Index does prevent any links on the no indexed page from being indexed, can't I build interlinks from other related pages to my blogs so Google can still spider them?

In other words:

1) Leave No Index
2) Build interlinks to blogs
3) Remove No Follow
7:01 am on June 12, 2019 (gmt 0)

New User

joined:Apr 23, 2019
posts:1
votes: 0


Simple rule: Never ever use 'nofollow' on internal links and pages! I learned that the hard way.