| 8:11 am on Dec 10, 2013 (gmt 0)|
This can't be accurate. There will always be discrepancies between submitted and indexed numbers and you can't control how many pages Google chooses to index out of each sitemap.
Breaking down your site into small chunks is a good start, but you have to do this in Analytics too: find out which section has been hit by comparing the traffic before and after the Panda update.
| 9:57 am on Dec 10, 2013 (gmt 0)|
I have done this and it is a good indicator, depending on the quantities of pages you are talking about.
The sector I was performing in allowed me to create dozens of sitemaps of 100 pages each.
No reason why any of the pages should not be indexed.
I found some sitemaps with 0 pages indexed, and others anywhere from 25 up to the full 100.
I then discovered trends, i.e. pages with similar title tags and URLs. (The on-page content was considerably different, which is why I did not remove them initially.)
I then did different experiments with each sitemap group, until I saw a recovery, then applied the solutions across the board.
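The chunking step described above can be sketched in a few lines. This is a hypothetical illustration, not the poster's actual tooling: it splits a flat list of URLs into numbered sitemap files of 100 URLs each, so that each file's Submitted vs. Indexed count can be compared in WMT. The file-name prefix and the idea of feeding in a plain URL list are my assumptions.

```python
# Sketch: split a flat list of URLs into sitemap1.xml, sitemap2.xml, ...
# each holding up to 100 URLs (the chunk size the poster used).
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def write_sitemaps(urls, chunk_size=100, prefix="sitemap"):
    """Write one sitemap file per chunk of URLs; return the file names."""
    files = []
    for i in range(0, len(urls), chunk_size):
        root = ET.Element("urlset", xmlns=NS)
        for url in urls[i:i + chunk_size]:
            loc = ET.SubElement(ET.SubElement(root, "url"), "loc")
            loc.text = url
        name = f"{prefix}{i // chunk_size + 1}.xml"
        ET.ElementTree(root).write(name, encoding="utf-8",
                                   xml_declaration=True)
        files.append(name)
    return files
```

Grouping by chunks of 100 is arbitrary; grouping by site section (as suggested later in this thread) usually gives more interpretable per-sitemap indexing numbers.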
| 1:25 pm on Dec 10, 2013 (gmt 0)|
Did you change the titles and URLs of the pages that weren't listed (at first) until they were listed/indexed?
| 4:01 pm on Dec 10, 2013 (gmt 0)|
|No reason why any of the pages should not be indexed. |
First, you're assuming that Google's indexing process is perfect.
Second, you're assuming that, if Google is failing to index certain pages, it's because those pages aren't passing the Panda sniff test.
The first assumption is highly questionable, and the second assumption is just a shot in the dark.
| 11:37 am on Dec 11, 2013 (gmt 0)|
Let's just clarify that I answered the original question and don't have an issue anymore.
I now have around 98% of sitemap pages indexed in Google.
I got my recovery.
The pages that were not listed were similar versions of those that were.
Then, looking at search criteria, I noticed that traffic-wise the missing pages didn't deliver that much.
As I started to clean these pages away, other, better pages returned and showed in the sitemap counts. It was interesting to see pages clashing in ways I would not have predicted, which gave me an indicator that I was on the right track.
As I say, I now have a 98% listing in sitemaps, compared to a much smaller share originally.
Traffic is down 20% from the old days, but with 80% fewer pages than the original site.
Yes, 80% of my site's pages were delivering only 20% of the traffic.
I created a second site with a completely different approach and content structure that targeted those speciality search terms I had lost, and with a relatively small number of pages got the last 20% or so of traffic back as well.
I would say that conversion is down, though, as these specific long-tail terms did convert better: just a small change in page content (product info) would relate to a customer's need more specifically.
| 5:09 pm on Dec 11, 2013 (gmt 0)|
Could you please provide a link to the article you are referring to so that we can take a look?
| 5:32 pm on Dec 11, 2013 (gmt 0)|
Getcooking, how can I split my sitemap? I think this trick will help me a lot. I have a lot of supplementary posts (omitted results) in the Google SERPs.
| 5:55 pm on Dec 11, 2013 (gmt 0)|
May I ask what the solutions were that you applied across the board once you'd finished experimenting with each sitemap group?
| 8:11 pm on Dec 11, 2013 (gmt 0)|
I originally thought the issue was quality of content, so I rewrote the content in a different way in the first three subgroups.
In subgroup 4, I focused on removing H1 tags, etc.
In subgroup 5, title tag changes.
In subgroup 6, sub-URL names.
I got no improvement in any pages returning (to the index, or to the indexed counts on sitemaps) until I started on the URLs (subgroup 6).
If I had 3 or 4 URLs with similar descriptions (actually more specific to the products), I removed certain variations (using the Remove URL tool in WMT, even though they were not counted in sitemaps or appearing in the index) until one version reappeared live and I saw the indexed number in sitemaps increase.
Just to clarify: the URLs were similar, but the content was not.
(The sector I am in would have suggested I needed each page: similar in URL name, but very different in actual products.)
But in the end it affected 20% of traffic and reduced the conversion rate too.
I learned I hadn't got to grips with Panda, focusing far more on the "improve content" aspect than on the URLs.
The content had always been good enough and unique, but when Panda went live I was initially convinced that's all it could be.
Once I started choosing which URLs to keep and which to let go, the indexed count of the other pages I had chosen to keep increased daily.
I then repeated the same process on the other five groups, and even with (now) different content, title tags, etc., they too started to come back.
When creating new pages (even blog posts and news items), I now think very carefully about what to name the URL and whether it will be too similar to other pages, even when the actual written content follows a completely different line of subject.
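The "find clashing URL names" step described above can be roughed out programmatically. This is a hypothetical sketch, not the poster's method: it greedily clusters URL slugs by string similarity, with the shortest slug in each cluster first (matching the poster's habit of keeping the shortest as the master). The 0.8 similarity cutoff is an arbitrary assumption you would tune for your own site.

```python
# Sketch: cluster near-identical URL slugs so a human can pick one
# master per cluster and retire the rest. Cutoff of 0.8 is a guess.
from difflib import SequenceMatcher

def group_similar(slugs, cutoff=0.8):
    """Greedily cluster slugs by string similarity; shortest first."""
    groups = []
    for slug in sorted(slugs, key=len):  # shortest becomes the master
        for group in groups:
            if SequenceMatcher(None, slug, group[0]).ratio() >= cutoff:
                group.append(slug)
                break
        else:
            groups.append([slug])
    return groups  # group[0] of each cluster is the "master" candidate
```

A greedy single-pass clustering like this is crude (order-dependent, O(n²) in the worst case), but it is enough to surface the 3-or-4-URL clashes the poster describes before deciding manually which to keep.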
| 8:22 pm on Dec 11, 2013 (gmt 0)|
Unfortunately the Forum Charter [webmasterworld.com] prohibits us from linking to that article, but the article in question suggests putting each of your site categories into a separate sitemap and then comparing the Pages Submitted vs. Pages Indexed for each category. It does not say much more than this on the subject.
It is important that you include in each (sub)sitemap all pages that are allowed to be indexed for that category. (I say that because sitemaps sometimes include only the important pages; for this analysis to be of any use, you must not restrict the sitemap to important pages only.)
Read these results with care. If you have a duplicate content problem, the reason a page from the sitemap is not indexed may be that Google picked a different canonical URL from the one you have in the sitemap for that page.
Create multiple sitemaps (sitemap1.xml, sitemap2.xml, etc.), each containing the URLs from one of the categories (or sections) of your website.
Then you can either submit all these different sitemaps to WMT or, better, create a Sitemap Index file where you list all the sitemaps you created. More on the sitemap index file here:
Sitemap index file
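For reference, a sitemap index file is just a small XML document listing the child sitemaps, per the sitemaps.org protocol. A minimal example (the example.com URLs are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap2.xml</loc>
  </sitemap>
</sitemapindex>
```

You submit only the index file to WMT; the per-sitemap Submitted vs. Indexed counts still show up individually for each child sitemap.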
| 8:55 pm on Dec 11, 2013 (gmt 0)|
Thanks for the explanation.
So you basically just changed your URLs to make them distinct within your site, and your rankings recovered. Have I understood that correctly?
Congratulations on your recovery by the way.
| 9:31 pm on Dec 11, 2013 (gmt 0)|
Yes, I suppose so.
Some URLs were changed altogether; for others I just chose a master URL and removed those that were similar.
I normally chose the shortest one.
I didn't change the URL/subfolder structure, just the URL names within that structure/folder, which in my case was two levels in.
It was almost as if the URL names were more important than the content within.
The URLs were similar, but even today I think the actual written content was different enough to deserve being presented as separate pages.