Removing Content Pages From Google Index

     
7:02 pm on Nov 11, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 7, 2002
posts: 403
votes: 0


Dear Friends/Experts,
I'd like to remove some duplicated content pages from the Google index. Is it enough to ban the pages in question in robots.txt and add

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

to the pages?

Also, what else can I do to keep some new pages from being indexed by Google and other search engines?

Thank you in advance for any comments. :)

7:19 pm on Nov 11, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 9, 2005
posts:1509
votes: 0


To remove pages immediately, sign up for Google Webmaster Tools and use the URL removal link.

To remove pages as you indicated, you should use only one of those methods. If you disallow a page in robots.txt, it will not be accessed again, so the 'noindex' meta tag will never be seen, making it ineffective.
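For example, a robots.txt ban is just a disallow rule (the /duplicates/ path below is a placeholder for your own URLs):

User-agent: *
Disallow: /duplicates/

If instead you want the 'noindex' meta tag to do the work, the pages must stay out of robots.txt, so the spider can still fetch them and read the tag.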

If possible, I think the better method is to 301-redirect the duplicated pages to a single canonical set of pages, so you gain the benefit of any inbound links. If redirecting is not an option, either method you suggested should be effective, both for removing pages and for keeping new pages out of the index.
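On an Apache server, a minimal sketch of such a redirect in .htaccess could look like this (old-page.html and new-page.html are placeholder names, not anything from your site):

# Permanently redirect the duplicate to the preferred URL
Redirect 301 /old-page.html http://www.example.com/new-page.html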

Justin

7:31 pm on Nov 11, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 7, 2002
posts: 403
votes: 0


Justin, thank you very much for the reply. Will the robots.txt ban and the <META NAME....> tag work for other search engines too, or just Google?

Are there any ways to hide just part of a page, such as some links or part of the content? I know about "nofollow", but are there other ways to hide links/content completely, and from all search engines?

7:42 pm on Nov 11, 2007 (gmt 0)

Senior Member

tedster: WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


One good approach to hiding part of a web page from indexing is to put that content at a separate URL, one that you disallow in robots.txt. Then you can display that content in an iframe on the original page for your human visitors, but search engines will never see it.
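A minimal sketch of that setup, assuming a made-up /noindex/ directory for the separated content:

# robots.txt
User-agent: *
Disallow: /noindex/

<!-- on the original page -->
<iframe src="/noindex/hidden-part.html" width="100%" height="300" frameborder="0"></iframe>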

But there is no HTML mark-up that disallows indexing for just part of a document.

7:52 pm on Nov 11, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 9, 2005
posts:1509
votes: 0


Banning in robots.txt will work for any standards-compliant robot/spider, so it should work for all (or nearly all) commercial search engines. The robots meta tag should also work in the major SEs, but may not be used by some of the smaller ones; I don't know personally, because I usually focus on the 'big 3'.

There are some ways to keep portions of pages from being 'seen'. The most widely used are JavaScript and/or iframes, but you will have to test to determine which works best for you.
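As a rough sketch of the JavaScript method, which relies on crawlers not executing JavaScript (the element ID and link URL below are made up for illustration):

<div id="hidden-links"></div>
<script type="text/javascript">
// Spiders that do not run JavaScript will never see this markup
document.getElementById('hidden-links').innerHTML =
    '<a href="/some-page.html">A link search engines will not see</a>';
</script>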

Keep in mind that any type of 'hiding text' can be considered cloaking and/or spamming, so you really have to make your own determination, and use caution/discretion when implementing any system that shows different information to visitors and search engines.

I would suggest doing quite a bit of research, so you know the risk/reward before attempting to hide information... Also, keep in mind that the way things are treated today could change tomorrow, and what is 'not seen' today might be seen as a 'red flag' in the near future.

Justin

11:45 am on Nov 12, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 7, 2002
posts: 403
votes: 0


Ted, Justin
Thank you very much :)

3:28 pm on Nov 12, 2007 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2007
posts: 51
votes: 0


We removed content, basically all of our pages, 5,000+, using the URL removal tool. Worked great. Then we removed all but 14 pages from our server. Two weeks later, we cancelled the URL removal request, in the hope that just the 14 pages would eventually be reindexed. Much to our dismay, most of the pages, no longer even on our server, have reappeared in the SERPs. Honestly, I think the URL removal tool should be renamed the URL 'hold' tool, as it does not really appear to remove anything permanently.

10:00 am on Nov 13, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 7, 2002
posts: 403
votes: 0


DannyTweb, thank you for the comments. If you have any other tips for content removal, please post.

1:57 pm on Feb 14, 2008 (gmt 0)

Senior Member

quadrille: WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member

joined:Feb 22, 2002
posts:3455
votes: 0


You cannot control Google. You can ask for a removal (usually unwise, if you want visitors), but doing this as a 'trick' to try and force Google to update their database is doomed.

The 'gone' pages will fall out eventually; meanwhile, just be sure that the 'new' pages are better, and therefore more likely to appear in the SERPs.

The fact that dead pages 'can' be found does not mean that they will be (by the average searcher); try a few keyword searches and you'll see.