
Google SEO News and Discussion Forum

    
Using Webmaster Tools URL Removal Tool
speedshopping

5+ Year Member



 
Msg#: 4534660 posted 2:36 pm on Jan 9, 2013 (gmt 0)

Hi,

We have recently got into a mess with our 301 redirects, to the point where our URL-rewritten destination pages have been duplicated up to 6 times across several subdomains.

We are pretty sure that our plight has resulted in us getting a 950 penalty across multiple subdomains.

We are in a position to use the directory removal tool in Webmaster Tools, but on one specific domain we need to re-open a directory as soon as the removal has gone through, so that Google can index the pages inside it that we want to keep.

To summarise,

1) We have 10,000 URLs inside /dir/ that are duplicates, but the directory also contains 2,000 that are genuine.

2) We want to remove the /dir/ directory with the removal tool so that it drops them all quickly.

3) We then need to re-allow Googlebot to come and re-crawl the 2,000 genuine URLs. The 10,000 duplicate URLs (should Googlebot revisit them) will carry canonical tags pointing at the correct URL destination (see the example below), thus keeping the duplicate content out.
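
For illustration, each duplicate URL would carry a canonical tag of roughly this form in its <head> (example.com and the path are placeholders, not our real URLs):

  <!-- on each duplicate page, pointing at the genuine version -->
  <link rel="canonical" href="http://www.example.com/dir/genuine-page/">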

Is this something that the WMT removal tool allows?

Cheers guys.

Wesiwyg

 

aristotle

WebmasterWorld Senior Member, 5+ Year Member, Top Contributors of the Month



 
Msg#: 4534660 posted 9:58 pm on Jan 9, 2013 (gmt 0)

I'm not sure I understand your question, but as a general rule I think you need to either delete (404 or 410) or noindex the pages before you submit them to Google's removal tool. After they are submitted to the tool, in my experience Googlebot usually crawls them within a few hours, and they are gone from the index within 24 hours.
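
For what it's worth, the noindex route just means each page to be removed serves something like this in its <head> before you submit it to the tool (a sketch of the tag only; a 404/410 response from the server is the other option):

  <!-- tells Google to drop this page from the index -->
  <meta name="robots" content="noindex">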

speedshopping

5+ Year Member



 
Msg#: 4534660 posted 10:33 pm on Jan 9, 2013 (gmt 0)

We can easily robots.txt out the /dir/ and delete it within a few hours; that's not the problem.
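
For reference, blocking the directory comes down to a robots.txt entry like this, which we would delete again once the removal has gone through (/dir/ stands in for the real path):

  User-agent: *
  Disallow: /dir/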

This /dir/ directory has 2,000 URLs we want re-crawled immediately after removing it; they are collateral damage from removing the duplicate content. So I want to know, firstly, whether Google will re-crawl them soon after we remove the "disallow" in robots.txt, and secondly, whether these re-crawled URLs will pass the same amount of link juice onto others now that they have been deleted and re-indexed?

Cheers

TheMadScientist

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 5+ Year Member



 
Msg#: 4534660 posted 10:39 pm on Jan 9, 2013 (gmt 0)

I'm pretty sure the URL removal tool removes them for 90 days minimum, so I doubt I'd go that route ... I think I'd just get the redirects right from the duplicates to the pages you actually want to keep, and not 'over do' things or 'over manage' it.

You have duplicates ... They've seen that for years ... Generally, they'll just 'pick a version' to use if there's not a redirect or canonical link relationship to 'give them an indication' of which should be used as the canonical.

It may be a bit different since the duplication is across subdomains rather than a single domain, but that's also the 'technical case' when a domain is available both with and without the www, so I think I'd just get it right with the 301 redirects and not 'get fancy' with it or 'over do' the fix.
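
As a sketch of the 'get the redirects right' approach (shop.example.com and www.example.com are placeholders for whichever duplicate subdomain and canonical host apply), an .htaccess rule of roughly this shape sends a whole subdomain to the canonical host while preserving the path:

  RewriteEngine On
  # send everything on the duplicate subdomain to the canonical host
  RewriteCond %{HTTP_HOST} ^shop\.example\.com$ [NC]
  RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]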

ZydoSEO

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4534660 posted 10:57 pm on Jan 9, 2013 (gmt 0)

A Disallow: directive in the robots.txt is not necessarily going to get them removed from Google's index any time soon. It will simply mean that they will no longer crawl them. A meta robots noindex would get them removed but does not seem appropriate since there are so many.

I would probably just 301 redirect the 8,000 or so duplicate URLs, since they were likely only linked to from your site (and I'm assuming all such internal links have been fixed).

Hopefully, you can do some pattern matching in mod_rewrite/.htaccess so that you don't need 8,000 page-by-page redirects.
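
For example (the /dir/-to-/products/ mapping here is purely hypothetical), a single pattern-matched rule can cover the whole directory:

  RewriteEngine On
  # one rule instead of 8,000 individual redirects
  RewriteRule ^dir/(.+)$ http://www.example.com/products/$1 [R=301,L]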

speedshopping

5+ Year Member



 
Msg#: 4534660 posted 11:49 pm on Jan 9, 2013 (gmt 0)

If the 90 day minimum removal stands, this may be a show-stopper, because the non-dupe URLs have a canonical to the correct place on other subdomains. Without Google re-crawling, indexing and honouring the canonical, the link juice would stop flowing and rankings could be affected... The reason for opting for the removal tool is to resolve this issue quickly - as mentioned, most of our traffic went 950 and we have proof that it's because of this cross-subdomain dupe content...

We have dropped the 301 redirects because somehow Google has managed to get into this mess.

TheMadScientist

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 5+ Year Member



 
Msg#: 4534660 posted 11:55 pm on Jan 9, 2013 (gmt 0)

A Disallow: directive in the robots.txt is not necessarily going to get them removed from Google's index any time soon. It will simply mean that they will no longer crawl them. A meta robots noindex would get them removed but does not seem appropriate since there are so many.

When used in conjunction with the removal tool, it's immediate ... You either have to put a block in the robots.txt or a noindex on the pages to use the tool successfully.

We have dropped the 301 redirects because somehow Google has managed to get into this mess.

I'd double-check on the 90 day minimum. I haven't looked into it in a while, but that's what it used to be.

I would guess there was some issue with the redirects previously, because with the 301s in place the URLs would not have been accessible except for the canonical version (other than giving you a larger 'not selected' list), so there should not have been duplication ... Maybe I'm missing something, or what was the initial issue that caused the 301s to be removed?

speedshopping

5+ Year Member



 
Msg#: 4534660 posted 12:13 am on Jan 10, 2013 (gmt 0)

In hindsight we should have placed a canonical in addition to the 301 to make sure we were covered. This 90 day limit looks real, so it looks like we are going to have to sit tight while Google gets through these URLs... I am shocked Google has 950'd us based on dupe content, but it may have tipped us over the edge...

TheMadScientist

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 5+ Year Member



 
Msg#: 4534660 posted 12:19 am on Jan 10, 2013 (gmt 0)

That is a bit interesting, because from what I understand about the -950 it was more 'word/structure' oriented, but applied differently to different pages/sites/topics ... I'd maybe dig through that a bit more while waiting for Google to get through the pages, to see if there's anything that 'sticks out' as possibly being an issue besides the duplication.

speedshopping

5+ Year Member



 
Msg#: 4534660 posted 12:24 am on Jan 10, 2013 (gmt 0)

It's not the first time we have met the 950; obviously we have over-optimisation that Google can just about cope with, but with this added dupe variable it's had enough. One glaring bit of proof was the fact that subdomains that didn't have any cross-duplication remained active...

ZydoSEO

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4534660 posted 1:56 am on Jan 10, 2013 (gmt 0)

When used in conjunction with the removal tool, it's immediate ... You either have to put a block in the robots.txt or a noindex on the pages to use the tool successfully.


I'm well aware that before using the URL removal tool, you need to block them with a Disallow: in the robots.txt or a meta robots noindex on the pages so that they won't get reindexed.

I was simply clarifying that blocking them with a robots.txt Disallow: directive on its own will NOT get the URLs dropped from Google's index immediately. It would have to be done in conjunction with the URL removal tool if the goal is to remove them immediately.
