The issues we found on our site:
1. Many duplicate content (canonical) problems: different URLs leading to the same physical page, like ...blabla.com/ and ...blabla/index.php or blabla/index.php?=page5.
2. Bad internal linking structure (using different URLs to point to the same content).
3. Some removed pages created gaps in the link structure.
4. Bloated pages (image viewer, tracking code still in the pages but not used, etc.).
It's too early to tell if this caused our ranking problems, but we thought a cleanup was needed, especially because the site is still a vital one, with loads of unique content on it.
What we did was:
1. Cleaned up the mess in the code and removed unneeded stuff.
2. Created a new sitemap (only using the correct URLs, the blabla/ form without any additions).
3. Cleaned up the internal links (to conform to the sitemap).
4. Removed all unwanted and unneeded pages.
5. Made all the articles available on one category page (they were spread out over 15 pages with those nifty blabla/index?=page1 etc. pages).
6. Created a 301 from every blabla/index.php to blabla/.
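For anyone wanting to double-check steps 2 and 6 above, here is a minimal sketch in Python of the URL mapping they imply, plus a sitemap built only from the resulting canonical URLs. The function names `canonical_url` and `build_sitemap` and the example.com URLs are made up for illustration; the sketch assumes that no query string on the site selects distinct content any more (true for the old paging parameters after step 5, but possibly not for other pages).

```python
from urllib.parse import urlsplit, urlunsplit
import xml.etree.ElementTree as ET


def canonical_url(url: str) -> str:
    """Map .../index.php (with or without a query string) to the bare directory URL."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("/index.php"):
        # Keep the trailing slash, drop the file name.
        path = path[: -len("index.php")]
    elif path == "":
        path = "/"
    # Assumption: query strings and fragments no longer select distinct content.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def build_sitemap(urls) -> str:
    """Return sitemap XML that lists each canonical URL once, in first-seen order."""
    unique = dict.fromkeys(canonical_url(u) for u in urls)
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for u in unique:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs; three variants of the same page collapse to one entry.
    print(build_sitemap([
        "http://example.com/archives/blabla/",
        "http://example.com/archives/blabla/index.php",
        "http://example.com/archives/blabla/index.php?=page5",
    ]))
```

The same `canonical_url` mapping can be run over the internal links to spot any that still point at an index.php variant.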
Google ranking has not improved so far, which is perhaps to be expected, as some of the changes were finalized just a week ago.
What did happen is Google Webmaster Tools going berserk, which makes me even more itchy...
Things I see in Google webmaster tools:
1. Crawl errors (not found pages; that's expected, as I removed some).
2. Duplicate title tags for many if not all of our pages. The most common problem found is:
/archives/blabla/
/archives/blabla/index.php
The same physical page is listed here as a duplicate, even though the index.php version has a 301 to the / version.
3. The sitemap shows about half the URLs indexed, slowly moving up during the last week.
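Since item 2 hinges on the index.php version really answering with a 301 to the / version, it may be worth confirming what the server sends. Below is a rough sketch using only Python's standard library; the helper names and the example.com URL are hypothetical, and the request deliberately does not follow redirects so the first response can be inspected.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def fetch_status_and_location(url: str):
    """Send a HEAD request without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def redirects_permanently_to(url, status, location, expected) -> bool:
    """True only for a 301 whose Location header resolves to the expected URL."""
    if status != 301 or not location:
        return False
    # Location may be relative, so resolve it against the requested URL.
    return urljoin(url, location) == expected


if __name__ == "__main__":
    # Hypothetical URL; replace with a real index.php page from the site.
    old = "http://example.com/archives/blabla/index.php"
    status, location = fetch_status_and_location(old)
    print(status, location,
          redirects_permanently_to(old, status, location,
                                   "http://example.com/archives/blabla/"))
```

A 302, or a 301 that points somewhere other than the / version, would show up as `False` here.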
Questions:
1. How much time does it take before Google no longer sees the / and /index.php versions as duplicate pages?
2. Is there a way to clear this manually? (The 301 is in place already.)
3. Did anyone see a site recover after "house cleaning", and after how much time?
4. Who else had a big drop around 26 October, and has some insight into what might have changed on Google's side?
My apologies up front for the lengthy post.
My experience is that when Google sees bigger changes to the site URLs (new redirects, changes in internal linking structure, etc.), it crawls much more, but it takes time for this to be reflected in its index. It is almost as if it says, "I am not sure you wanted to do this, so I will wait a bit for the site to become stable before reflecting this in the SERPs."
How long? It can take anywhere from a month to 6 months, and sometimes even longer, depending on the number of pages and the changes you made. I feel that the trust of the site plays a factor here; it appears that older and more trusted sites recover their rankings quicker.
When we did our house cleaning in February, we started to improve our rankings after about 6 weeks.
If you had many duplicate content pages (you mention up to 15 URLs pointing to the same content), and if some of these had external or internal links, then consolidating them into a single URL should strengthen the link juice for that page, and this by itself should improve ranking.
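To picture the consolidation, here is a toy sketch in Python that tallies links per final URL once the 301s are applied. The link list, the redirect map and the function name `consolidate_links` are invented for illustration, and a raw link count is only a stand-in: it is not a claim about how Google actually weighs links.

```python
from collections import Counter


def consolidate_links(link_targets, redirects):
    """Count links per final URL after following the 301 map (loop-safe)."""
    counts = Counter()
    for target in link_targets:
        visited = set()
        # Follow redirects until a URL with no redirect (or a loop) is reached.
        while target in redirects and target not in visited:
            visited.add(target)
            target = redirects[target]
        counts[target] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical data: links that used to be split across three URL variants.
    redirects = {
        "/archives/blabla/index.php": "/archives/blabla/",
        "/archives/blabla/index.php?=page2": "/archives/blabla/",
    }
    links = [
        "/archives/blabla/",
        "/archives/blabla/index.php",
        "/archives/blabla/index.php?=page2",
        "/other/",
    ]
    print(consolidate_links(links, redirects))
```

Before the redirects, each variant had one link in this made-up data; afterwards, the single / URL collects all three.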
And a word of warning: it is possible that you will see the site drop even further before it starts to recover. Do not panic and just wait; with the changes you described, you should surpass your current ranking.