| 4:52 pm on Dec 18, 2008 (gmt 0)|
Hello BBonanza, and welcome to the forums.
One thing to keep in mind is that Google often shuffles their back end as they move data around, preparing new algo factors and new infrastructure configurations. The site: numbers are almost never accurate for sites of any size - they say "about" and the numbers can shift dramatically as you drill into the deeper results pages for a site: operator query.
Another factor you can notice even on some small sites is that Deeper Site: Queries Can Return More URLs [webmasterworld.com].
| 5:17 pm on Dec 18, 2008 (gmt 0)|
I've seen this too - and I can't dig deeper as the pages are at the www.domain.com/page-name level.
One thing I'm hypothesising with our site is that the pages are too similar, so we're working on making each one as unique as possible.
Google still crawls the pages happily enough, and will index up to 30,000 or so, but it regularly drops a large batch, so the number indexed sometimes falls back below 1,000.
We've already added 'noindex' to the pages we don't consider unique enough for Google to like at present. We rely on user-added content for many pages, and sometimes users just don't add enough! Fingers crossed this is going to work - it only went in this week.
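As a rough illustration of that rule, here is a minimal Python sketch that picks a robots meta tag from the amount of user-added text on a page. The function name, the 150-word threshold and the idea of simply counting words are all assumptions made for the example, not a description of how the site above actually decides.

```python
# Hypothetical sketch: choose a robots meta tag from the amount of user-added content.
# MIN_UNIQUE_WORDS is an assumed threshold, not a figure from this thread.
MIN_UNIQUE_WORDS = 150


def robots_meta_tag(user_content: str, min_words: int = MIN_UNIQUE_WORDS) -> str:
    """Return an 'index' tag only when the page has enough user-added text."""
    word_count = len(user_content.split())
    if word_count >= min_words:
        return '<meta name="robots" content="index, follow">'
    # Thin page: leave it crawlable so its links are followed,
    # but ask search engines not to index it.
    return '<meta name="robots" content="noindex, follow">'


if __name__ == "__main__":
    print(robots_meta_tag("Great venue, would book again."))
```

A plain word count is a crude stand-in for "unique enough"; any real rule would probably need tuning against the pages that actually get dropped.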
You say you have similar pages which show different products - is the rest of the content on those pages the same, apart from the product content? Because that's a little like our pages which get dropped.
| 5:42 pm on Dec 18, 2008 (gmt 0)|
That sounds like what Google calls "stub pages" - it's definitely worth handling those well, to whatever degree you can. More unique content for them is the best option, I'd say, but noindex is one possible fallback.
| 8:52 am on Dec 19, 2008 (gmt 0)|
Thanks for the replies, guys.
Yes, our products are different, but we code the differences into our titles, meta data, heading text, alt tags, etc. We did have a block of text on every page which was identical aside from the product name. We have dumped this because it may be seen as duplicate content and keyword stuffing.
We noindex the offset result pages in a product list because Google seems to give them no ranking at all.
We really try to keep it clean.
We have noticed that the new "promote and remove" icons keep appearing and disappearing next to the results, so there are obvious changes in the pipeline which may affect the way Google is working.
But hey, it is all surmise, surmise in Google world.
| 10:53 am on Dec 19, 2008 (gmt 0)|
Sorry - the above may have been a bit misleading.
"promote and remove" icons keep appearing and disappearing next to the results IN GOOGLE
| 4:45 pm on Jan 5, 2009 (gmt 0)|
OK, we've solved the problem I mentioned above with our site.
The problem was that the pages were too similar (or possibly that there were just too many of them).
To solve it we've added 'noindex' to all pages we do not consider unique enough, i.e. those that do not have enough unique content. I've also blocked two sub-domains completely using the robots.txt file for the time being, while we're working on making those pages more unique.
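For anyone wanting to check a block like that before relying on it, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt content and the example.com host names are assumptions for illustration; the actual file used on the sub-domains above is not shown in this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt served at the root of a blocked sub-domain.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical host name, used only to exercise the rules.
    print(is_crawlable(BLOCK_ALL, "http://sub.example.com/some-page"))
```

Two things to bear in mind: robots.txt applies per host, so each sub-domain needs its own file, and a crawler that is blocked from a page cannot read a 'noindex' tag on it, so the two methods do not combine on the same URL.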
When we free up those sub-domains again we will never let as many pages be available for indexing as we had before (200,000), only those which have sufficient user-generated content.
Having put the changes in and submitted a reconsideration request to Google, our homepage and our Christmas page (thank you Google!) are back in the number 1 position for our required search term, but we're still working on the other pages and other search terms.
BBonanza - you mention that you code your page titles, meta data and headings to be different for each page, but so did we. I'm guessing your products still sit within what is effectively a template, with the same menu, website heading etc. on each page? Maybe you have to find some way to increase the unique content on your pages.
| 6:45 pm on Apr 6, 2009 (gmt 0)|
I've noticed a drop in the number of URLs indexed by Google on my website over the last 2 months. I would be interested in everyone's opinion on whether a lack of unique content is better than, worse than or the same as having duplicate content. I have pages where the content is approximately 90% the same (but in different formats). While these are a small percentage of my pages, I'm still concerned about generating enough unique content in a dynamic fashion. I appreciate any input.
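One way to put a number on "approximately 90% the same" is to compare two pages' text by overlapping word sequences (shingles) and take the Jaccard similarity of the two sets. The Python sketch below is only an illustration under that assumption: the function names and the 3-word shingle size are made up for the example, and nothing here claims to match how Google measures duplication.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of length `size` in text."""
    words = text.lower().split()
    if len(words) < size:
        # Too short to shingle: treat the whole text as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Shared shingles divided by all distinct shingles, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    page_a = "buy the red widget online today with free delivery"
    page_b = "buy the blue widget online today with free delivery"
    print(jaccard_similarity(page_a, page_b))
```

Run over the visible text of two pages (with the shared menu and header stripped out first), a score close to 1.0 suggests the pages differ in little more than a product name.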