
Google SEO News and Discussion Forum

How Does Google Treat No-indexed Pages?
aristotle




msg:4296797
 11:08 am on Apr 13, 2011 (gmt 0)

Adding a noindex meta tag to a page's header causes Google to remove the page from its search index. Thus the page will no longer appear in Google's SERPs.
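
For reference, the tag I'm talking about is the standard robots meta tag placed in the page's <head>, along the lines of:

    <meta name="robots" content="noindex">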

But people can still visit the page, so even though it's no longer indexed, it still contributes content to the site. What I'm wondering is whether this content can affect the rankings of other pages on the site that are still in the index.

In other words, since people can still visit no-indexed pages, does the Google algorithm include them in its evaluation of the overall content and quality of the site?

 

Pjman




msg:4296816
 11:40 am on Apr 13, 2011 (gmt 0)

When I was hit by Panda, I noindex,nofollowed and robots.txt-blocked all the thin content. I'm removing the thin content and adding it to some salvageable pages.

After much debate here, and seeing Matt Cutts and Singhal advocate for noindex only, I switched to that. I mean, if the engineers are telling us to do it that way, it should work.

pageoneresults




msg:4296826
 11:57 am on Apr 13, 2011 (gmt 0)

I noindex,nofollowed and robots.txt-blocked all the thin content.


Hopefully one or the other? Googlebot will not see the noindex, nofollow if those documents are Disallowed via robots.txt. You have to use one or the other, not both.
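
To illustrate the conflict with a made-up /thin/ directory: if robots.txt contains

    User-agent: *
    Disallow: /thin/

then Googlebot never fetches /thin/some-page.html at all, so a

    <meta name="robots" content="noindex, nofollow">

on that page is never seen, and the URL can still end up in the index as a URL-only listing.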

In other words, since people can still visit no-indexed pages, does the Google algorithm include them in its evaluation of the overall content and quality of the site?


Yes, as long as it is just noindex. As soon as you add nofollow to the mix, they become dead-end pages, so be careful with noindex, nofollow. Using just noindex is fine, and it's recommended to keep documents out of the index - out of the index, but not out of the equation.

TheMadScientist




msg:4296843
 12:33 pm on Apr 13, 2011 (gmt 0)

...Matt Cutts and Singhal advocate for noindex only...

Do you have a reference for that? I must have missed it ... It could have been in one of the many similar articles I skimmed, but I'd like to see what else I might have missed.

And to reiterate and expand on what P1R said: if you use both, the robots.txt Disallow always takes precedence, so the pages may still be included in the index (often as URL-only listings).

TheMadScientist




msg:4296972
 4:53 pm on Apr 13, 2011 (gmt 0)

I guess I should answer the original question too...

In other words, since people can still visit no-indexed pages, does the Google algorithm include them in its evaluation of the overall content and quality of the site?

That's a really good question ... I don't have, and haven't read, any data about this since the Panda change ... I really don't want to make any guesses right now, because that's all it would be.

Shatner




msg:4297047
 6:49 pm on Apr 13, 2011 (gmt 0)

>>When I was hit by Panda, I noindex,nofollowed and robots.txt-blocked all the thin content. I'm removing the thin content and adding it to some salvageable pages.

I did that too, and by thin content I mean things like tag listing pages, search pages, etc. - not really thin content, but not really stuff Google needed to be indexing anyway. But that was the only thing I could find that could possibly be causing my Panda problem.

Yesterday in Panda 2.0 I lost another 25% of my Google traffic. So clearly that was the wrong move.

Pjman




msg:4297072
 7:24 pm on Apr 13, 2011 (gmt 0)

Blocking “Low Quality” Content

Matt reiterated that enough low quality content on a site could reduce rankings for that site as a whole. Improving the quality of the pages or removing the pages altogether are typically good ways to fix that problem, but a few scenarios need a different solution.

For instance, a business review site might want to include a listing for each business so that visitors can leave reviews, but those pages typically have only business description information that’s duplicated across the web until visitors have reviewed it. A question/answer site will have questions without answers… until visitors answer them.

In cases like this, Google’s Maile Ohye recommended using a <meta name=robots content=noindex> on the pages until they have unique and high-quality content on them. She recommends this over blocking via robots.txt so that search engines can know the pages exist and start building history for them so that once the pages are no longer blocked, they can more quickly be ranked appropriately.



From: Lessons Learned at SMX West: Google’s Farmer/Panda Update, White Hat Cloaking, And Link Building

Mar 12, 2011 at 1:36pm ET by Vanessa Fox

TheMadScientist




msg:4297074
 7:29 pm on Apr 13, 2011 (gmt 0)

Thanks, I do remember reading that now ... Ugh ... Too much info all at once with Panda to keep up with everything ... Probably need to re-read it, because I don't remember if there's anything else I don't remember! lol

aristotle




msg:4299088
 7:47 pm on Apr 16, 2011 (gmt 0)

I'm still wondering about this. The above quote from Vanessa Fox doesn't explicitly say that no-indexed pages are totally disregarded. And since people can still visit them, Google might still consider them to be part of the site's content.

Also, when I started this thread, I wasn't thinking about the Panda update, but about Google's general treatment of no-indexed pages. But the posters who mentioned Panda said that adding no-index tags to some pages didn't seem to help their overall rankings.

So I think the question is still open.

tedster




msg:4299094
 7:52 pm on Apr 16, 2011 (gmt 0)

I agree that it is still an open question - however no one is recovering from a Panda problem with any approach, so I don't know what that tells us about noindex in particular.

docklands




msg:4300128
 11:36 pm on Apr 18, 2011 (gmt 0)

I'm facing a similar problem. I run a website about hotels in a certain country. Every hotel listed has its own HTML page with a detailed description, contact details, location, and a picture gallery. On each page there are "contact now" and "read comments" buttons that link to PHP pages of "low quality" which are nevertheless useful for visitors. I can't delete those low-quality pages because they are a useful part of the site, but I can noindex them - although if they still count towards site quality, a noindex tag wouldn't do any good, would it?

What would you suggest as the best solution? I'm thinking of changing the main hotel pages to PHP, putting the contact and comment forms on them, then 301 redirecting the contact and comment pages to the main (PHP) page and the main HTML page to main.php.

The idea was to keep the main part of the site in HTML, but obviously if I add comment and contact forms to the hotel pages I'll have to change to PHP. Will switching from html to php harm my SERPs?

tedster




msg:4300132
 11:47 pm on Apr 18, 2011 (gmt 0)

Welcome to the forums docklands.

Will switching from html to php harm my SERPs?

If you change the file extension, yes. But you don't need to change the file extension. See this discussion from two weeks ago: Will switching from html to php harm my serps? [webmasterworld.com]
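
For what it's worth, one common way to keep the .html extension while having the server run those pages as PHP is an .htaccess handler directive. The exact handler name depends on the host's PHP setup (mod_php vs. CGI/FastCGI), so treat this as a sketch to confirm with the host rather than something guaranteed to work as-is:

    # .htaccess - handler name varies by host / PHP setup
    AddHandler application/x-httpd-php .html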

walkman




msg:4300216
 4:27 am on Apr 19, 2011 (gmt 0)

Read this from Matt in 2008 [mattcutts.com...]
Matt Cutts – Google Not prepared, but informal remarks. High order nits: what do people worry about? He often finds that honest webmasters worry about dupe content when they don’t need to. G tries to always return the “best” version of a page.

and compare that to what we think Panda does. Maybe a few months after losing our shirts we may find out that a 404 should have been used.

docklands




msg:4300694
 8:31 pm on Apr 19, 2011 (gmt 0)

Hi again,

I wasn't able to get the HTML parsed as PHP yet (I've tried every .htaccess modification with no luck), but I've raised the issue with my hosting provider and am waiting for a reply. So let's say I'll be able to keep the HTML pages and put the contact and comment forms on them. The question now is: what to do with the pages the contact and comment forms were on? Shall I 301 redirect them to the page I moved them to, or go for 410 Gone? If I 301 them, they will still be part of my website and still contribute to poor quality (or maybe not?). The site is new and there aren't any backlinks to the contact and comments pages.

indyank




msg:4300989
 3:22 am on Apr 20, 2011 (gmt 0)

I think this is a very important question for Google to address, as many people either block via robots.txt, use a noindex meta tag, or do both.

Many who look to Google for advice on recovering from Panda trust their word and use the noindex meta tag, since it has been strongly recommended by Google's Maile Ohye and others.

Since not many are reporting a recovery, this question really needs an answer from Google.

indyank




msg:4300991
 3:29 am on Apr 20, 2011 (gmt 0)

If you use "noindex" and not "noindex, nofollow", Google still has to parse the page text to find the links.

In that case it is not just users who see the content - the bots see it as well.

Will that content still feed into the quality evaluation?
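
For reference, the two variants being compared look like this (noindex on its own implies follow):

    <meta name="robots" content="noindex, follow">
    <meta name="robots" content="noindex, nofollow">

The first keeps the page out of the index but still lets the bot crawl it and follow its links; the second makes it a dead end as well.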

epmaniac




msg:4300994
 3:38 am on Apr 20, 2011 (gmt 0)

>>I wasn't able to get the HTML parsed as PHP yet (I've tried every .htaccess modification with no luck)

Docklands, you can't parse the HTML as PHP; what you can do is make PHP pages and rewrite the URLs so they still end in .html.
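
A minimal sketch of that rewrite approach, assuming mod_rewrite is available and the .php files sit alongside the old .html URLs (adjust paths to your own structure):

    # .htaccess - serve /page.html from /page.php internally
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.+)\.html$ $1.php [L]

The URL that visitors and Googlebot see stays .html, so nothing needs to be redirected.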

docklands




msg:4302952
 1:57 pm on Apr 23, 2011 (gmt 0)

I decided to return 410 Gone for all the contact and comments pages, although I'll move the contact and comment forms onto the main hotel pages. The site is still new and hasn't reached a good position in the SERPs yet, but I'll update if I see any change. The question now is: shall I remove those pages from the sitemap, or leave them so search engines can drop them from their index more quickly?
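
In case it helps anyone else doing the same thing, returning 410 Gone for a set of retired URLs can be done from .htaccess with mod_alias; the paths below are made-up examples, not my real URLs:

    # Return 410 Gone for retired contact/comment pages
    Redirect gone /hotels/contact-form.html
    RedirectMatch gone ^/comments/.*$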

walkman




msg:4302957
 2:10 pm on Apr 23, 2011 (gmt 0)

"shall I remove those pages from the sitemap or leave them so search engines can exclude the from their index quicker? "

LEAVE them - Google needs to visit them a few times so it sees that they are gone.

tedster




msg:4303194
 4:30 am on Apr 24, 2011 (gmt 0)

But don't leave them in your sitemap indefinitely - that's one way googlebot might continue to check frequently, long after requests could be dropped to a much less frequent cycle.

docklands




msg:4303365
 5:50 pm on Apr 24, 2011 (gmt 0)

Here is an update: the site used to sit at position #18 for a "hotels in *****" search for a few weeks. It's now not showing in the first 1000 results. :) Along with the 410s, I redesigned the homepage (keeping the navigation), wrote a few unique descriptions of some destinations in that country, and added one more very competitive keyword. I think the new design is much more user- and search-engine-friendly, but who knows - the redesign itself might be what's causing the site to disappear. (That happened to me before after a small change to the design of a high-ranking website; it eventually regained its positions a few weeks later, although it is now seeing a drop again.) Please note the site is only a month and a half old.

PS - by the way, I've changed the encoding from Windows-1252 to UTF-8.

aristotle




msg:4303395
 7:20 pm on Apr 24, 2011 (gmt 0)

Quite often when you make sudden major changes to a site, Google will need time to re-evaluate it, and in the meantime will drop its rankings. A full recovery could take several weeks or even longer.

I also think that major revamps might have a long-term effect on how much trust Google gives to a site.

I just tried to make a list of things that might affect how much Google trusts a site:

-- large number of redirects

-- large number of blocked pages

-- large number of deleted pages

-- new pages being added at a rapid rate

-- content is frequently revised or re-written.

I think all of these could have a long-term negative effect. Of course I can't prove it, rather it's just speculation on my part.

incrediBILL




msg:4303396
 7:30 pm on Apr 24, 2011 (gmt 0)

@pageoneresults said:
Hopefully one or the other? Googlebot will not see the noindex, nofollow if those documents are Disallowed via robots.txt. You have to use one or the other, not both.


Not true!

Using both is double protection, a technique I highly recommend.

I do both robots.txt and NOINDEX as a fail-safe, and it recently saved my bacon when I made a small mistake updating robots.txt. Thousands of pages would've been indexed within days, and it takes forever to get rid of that kind of mess. The redundant NOINDEX, however, stopped Googlebot from making a big mess in the first place.
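
For anyone wanting the same belt-and-braces setup, it is simply both of these at once (the /private/ path is just an illustration):

    # robots.txt
    User-agent: *
    Disallow: /private/

    <!-- and on each page under /private/ -->
    <meta name="robots" content="noindex">

As pageoneresults noted above, the Disallow normally keeps Googlebot from ever seeing the meta tag; the value here is purely redundancy in case one of the two is ever broken by mistake.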

docklands




msg:4303414
 8:20 pm on Apr 24, 2011 (gmt 0)

@aristotle

I totally agree with you, but shouldn't Google be aware that many webmasters will now delete pages and set up lots of redirects in an effort to improve their sites after Panda?

aristotle




msg:4303421
 8:52 pm on Apr 24, 2011 (gmt 0)

docklands -
I'm sure that some people at Google are aware of the situation. But making exceptions for sites hurt by Panda might require them to add some special provisions to the algorithm, and I don't know if they would be willing to do this.

Also, I may have misled you with my speculations. I do think major revamps can have a long-term negative effect, but I should have said that it probably slowly fades away. Someone who gets in trouble as a teenager still has a chance to have a successful life. In the same way a website can eventually recover from negative events.

docklands




msg:4303886
 11:46 pm on Apr 25, 2011 (gmt 0)

Just an update: redirected pages are now beginning to show in WMT as 404 (Not found), Linked From: unavailable.

aristotle




msg:4304029
 11:00 am on Apr 26, 2011 (gmt 0)

Just an update: redirected pages are now beginning to show in WMT as 404 (Not found), Linked From: unavailable.


I'm not sure what you mean by this. The server shouldn't return a 404 for a properly-done redirect. Maybe you should check to make sure you did the redirects correctly.

incrediBILL




msg:4304180
 4:08 pm on Apr 26, 2011 (gmt 0)

Just an update: redirected pages are now beginning to show in WMT as 404 (Not found), Linked From: unavailable.


Please don't start unsubstantiated rumors.

Keep in mind that Google crawls while you're doing site maintenance, and a one-second mistake can easily show up in WMT.

Google also shows erroneous crawl results; I see that stuff all the time, for inexplicable reasons.

docklands




msg:4304413
 9:08 pm on Apr 26, 2011 (gmt 0)

Sorry if I didn't make myself clear. I wanted to say that I've set up 410 responses for all of my "thin pages". They are set up properly, and when I try to reach one of those pages I get: "Gone ... is no longer available on this server and there is no forwarding address. Please remove all references to this resource."

What I wanted to say is that the pages are now starting to appear one by one in the crawl error log in Webmaster Tools as 404 (Not found) and not as 410, which is a bit weird to me. I haven't used 410 in the past, so I don't know if that's normal.
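
A quick way to double-check what the server actually returns (example.com here stands in for the real domain) is to request just the headers:

    curl -I http://example.com/old-contact-page.html

If the first line comes back as HTTP/1.1 410 Gone, the server side is configured correctly, whatever label Webmaster Tools puts on it.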

emmab21




msg:4305295
 11:06 am on Apr 28, 2011 (gmt 0)

I know this thread wasn't started with the intention of only covering post-Panda treatment of noindex, but since it has kind of gone down that track I wanted to ask:
I read a post that, amongst other things, said: "2) Move any bunk content off-site"
I was previously noindex-ing these pages, but now I'm not sure that's the right thing to do. Has anyone else tried moving content off-site, with notable results?
