
NoIndex Clarification

Trying to Stop Duplicate Content Indexing.

   
9:24 pm on Sep 27, 2006 (gmt 0)

10+ Year Member



We are trying to stop any duplicate content caused when the page parameter is missing from the URL.

We set up a VB sub that checks for these parameters in the URL. If the parameters are MISSING, it outputs <META NAME="Robots" CONTENT="noindex">. If the parameters are INCLUDED in the URL, it outputs <META NAME="Robots" CONTENT="index, follow">.
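For illustration, a minimal sketch of that kind of switch in Classic ASP / VBScript, assuming a single hypothetical parameter named "page" (the real parameter names and platform may differ):

<%
' Sketch only - assumes the expected parameter is called "page".
' Output a noindex meta tag when the parameter is missing from the
' query string, and a normal index tag when it is present.
Sub WriteRobotsMeta()
    If Len(Request.QueryString("page")) = 0 Then
        Response.Write "<META NAME=""Robots"" CONTENT=""noindex"">"
    Else
        Response.Write "<META NAME=""Robots"" CONTENT=""index, follow"">"
    End If
End Sub
%>

Calling WriteRobotsMeta from inside the <head> of the template means each page emits exactly one robots meta tag.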

Will this stop ONLY the unwanted URL from being listed in Google, or will it cause the entire page file to be dropped?

After reading many posts by g1smd, I think this setup will work, but this seems important enough to ask before implementation.

10:24 am on Sep 28, 2006 (gmt 0)

10+ Year Member



We are using the "noindex, nofollow" tag on many things. Things will eventually delist. The keyword is "eventually". Not sure how long it actually takes (weeks? months?); I would guess it will be months.

10:35 am on Sep 28, 2006 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If the URLs are in the normal index, they will be dropped within just a few weeks.

If they are Supplemental, they will take a lot longer to disappear, but Google will get rid of them eventually. It may be months. It might be a year.

Only the URLs that serve a noindex tag will be dropped. Others will remain.

10:37 am on Sep 28, 2006 (gmt 0)

10+ Year Member



Hi Kelcor

In my experience, that will work quite well. That said, if you can find another solution it might be better: Googlebot has to read all of your documents first, and then they need to be processed. That takes some time...

If some of your documents are already marked as Supplemental, it may take a very long time to remove them.

Do you have lots of URLs?

Regards

itloc

10:48 am on Sep 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm doing pretty much the same thing, except that I'm trying to stop any duplicate content caused when a parameter is present in the URL. I want only the plain URL to be indexed and my pages throw up the noindex meta when there's a parameter.
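As a rough sketch (assuming the same sort of Classic ASP / VBScript setup as the earlier post; my actual platform isn't stated here), the check is simply reversed: noindex whenever any query string is present.

<%
' Sketch only - noindex any URL that carries a query string,
' so only the plain URL gets the normal index tag.
Sub WriteRobotsMeta()
    If Len(Request.QueryString) > 0 Then
        Response.Write "<META NAME=""Robots"" CONTENT=""noindex"">"
    Else
        Response.Write "<META NAME=""Robots"" CONTENT=""index, follow"">"
    End If
End Sub
%>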

However, I've noticed that Googlebot is crawling the parametered URLs on a daily basis. The pages aren't indexed, since they were set up this way from the start, but I'm surprised at the regular ongoing crawling of pages with 'noindex'.

MSN, incidentally, has completely ignored the noindex and has indexed all the parametered URLs.

[edited by: Patrick_Taylor at 10:51 am (utc) on Sep. 28, 2006]

10:51 am on Sep 28, 2006 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Once Google knows about a URL, they will crawl it forever, checking whether the status they have for that URL is still correct.

It must work that way, otherwise changes that you make will never be picked up.

The set of URLs they crawl is larger than the set they index content for, and the set they index is larger than the set they show in the search results.

10:55 am on Sep 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It must work that way, otherwise changes that you make will never be picked up.

Yes, I suppose so. Thanks. The odd thing is that these are the pages most frequently crawled (at present).

 
