Forum Moderators: Robert Charlton & goodroi
Sometime within the last few weeks my Google traffic went down to 0, and when I did a site: search on my domain I noticed that only 21 pages were available in the results, yet it now reports over 5 million pages indexed. I liked it much better when I had 100k pages indexed and they were all available in the results.
Question: What is going on here? Should I expect Google to shortly be listing my 5 million pages? Or will the 5 million+ drop down to 21?
But from the sound of it, a bit of housekeeping would do you and the web a favour.
Check your navigation; use robots.txt to avoid the same pages being listed several times, etc.
A feature of dynamic sites that I've noticed before is that once a filter is triggered, items that were borderline that never hurt before, often start hurting.
If you can find a fault and fix it, great. If not, fix everything that MIGHT need fixing!
BTW - check your server; if there's been several or long downtimes, it may just be that.
Is this the same site you are now talking about?
No it's not the same site.
if there's been several or long downtimes
No downtimes.
If you can find a fault and fix it, great.
I'm of the philosophy of "if it ain't broke... don't fix it!" However, when it comes to an ever-mutating algo... you might be right. Even if it hasn't been broke for years, G may have changed some variable in the last update. I'm a little scared to mess with it though... what if G isn't done updating? It has been nice to have 100k pages indexed and the traffic that came with that.
UPDATE: yesterday my site was back on G again with 111K pages.... this morning it's back down to 21 pages.
without knowing more about the site: I think this is the key problem. Google might not like the "5 million" and the "dynamically generated" part.
I understand the philosophy of the long tail, but
5,000,000 dynamic pages... where does all the content come from?
I was reading a lot of posts from people talking about their 100,000, 500,000, 1,000,000 etc. page sites a while ago, and it always gets me wondering:
Will google/msn/yahoo always tolerate such sites, or are they
going to develop an algo to explicitly exclude such sites?
Don't get me wrong, I am all for people doing their own thing, I just wonder at the dynamics of it all
Some sour grapes on my part too :-)
In my sector, a couple of giant sites have so many pages indexed, and all pages ranking so heavily, that my sites can hardly breathe
So, I am not exactly rooting for these million pagers :-)
Okay, I surrender, I'm going to learn how to make a 10 billion page site
Google might not like the "5 million" and the "dynamically generated" part.
Dynamically generated means that it is database driven. More info on that: It is a multi-parent-child database. Countries X States X Cities X Institutions in the widget industry X other info X even more info = 5 million plus pages.
I agree, but this site has had between 87K to 100K+ pages indexed for years now. I never thought it would be fully indexed, nor even get 100K pages indexed when I first built it... but my point is that it was for years.
Also the result from "site:mydomain.com" has me a little puzzled -- "Results 1 - 21 of about 5,510,000 from mydomain.com for . (0.25 seconds) " G reports my site having 5,510,000 pages but even with supplemental results past the 21, I can only get to 31 pages. When G reported I had 100K pages, I could get to ALL 100K pages. Why only 31 now?
how different are those pages from each other? ... it's hard to have "unique" info for 5+ million pages.
Yes it is hard and no I don't have unique info on all pages...YET! As time goes by users register for this free service and add their unique info. Thousands already have and until now, I was getting more new members every month. The growth rate went from awesome to nil.
In general, if it ain't broke don't fix it is fine - so long as you monitor closely.
But in your case, it is broke (though you need to check on more than one datacenter).
You really do need to reread G's guidelines, you may spot a simple item that can be fixed.
Also, get Matt Cutts on your 'essential reading' list. With a site your size (and I'm assuming it has an income to match) you won't find a better time investment in these changing times.
Good Luck! :)
Make sure that every piece of content has one canonical URL used to access it and keep the spiders OUT of all the alternative URLs for that same content. Make sure all your title and meta description tags are unique.
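To make the "one canonical URL" advice concrete, here is a minimal sketch of how you might spot alternative URLs that serve the same content. Everything specific in it is an assumption for illustration: the example.com URLs, the parameter names, and the IGNORED_PARAMS list are hypothetical, and your own site will have its own rules for which parameters matter.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: query parameters assumed NOT to change the page content.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def canonical_url(url: str) -> str:
    """Reduce a URL to one assumed canonical form."""
    parts = urlsplit(url)
    # Keep only the parameters that select content, in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Lowercase scheme and host, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def group_duplicates(urls):
    """Map each canonical URL to the variants that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    # Only groups with more than one variant are duplicate-content candidates.
    return {canon: found for canon, found in groups.items() if len(found) > 1}


# Hypothetical URLs, e.g. pulled from a crawl or a server log.
urls = [
    "http://Example.com/widgets?city=12&state=3",
    "http://example.com/widgets?state=3&city=12&sessionid=abc",
    "http://example.com/widgets?state=3&city=13",
]
duplicates = group_duplicates(urls)
for canon, found in duplicates.items():
    print(canon, "<-", found)
```

Any group this reports is a set of URLs where you would link to only one form internally and keep the spiders out of the rest.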
Check several "GFE" Google datacentres directly, especially gv and eh and you'll see some very different results I expect.
Is there a "click here to see omitted results" link after the last result?
There may, or may not be. Depending on the answer to that, the cause may be slightly different, and the end result will be very different.
If the link is there, Google probably thinks that you have duplicate content and you need to fix it up.
If the link is not there, then Google is probably de-indexing your entire site; and that may be a spam penalty or something else instead.
Make sure that every piece of content has one canonical URL
Make sure all your title and meta description tags are unique.
Done those years ago which is what I believe got my first 100k pages indexed.
Is there a "click here to see omitted results" link after the last result?
Yes there is. Yesterday I had 21 results and 10 supplemental. Today I have 25 results and 10 supplemental. Today Google also reports that it now has 6,320,000 pages of my site indexed. That's almost a million pages more than yesterday.
At 4 pages more a day in the result, G should have all my pages in the result within 1,580,000 days... give or take. ;)
Since this started, it's been a coin toss whether I get a handful of traffic or not on any given day. Some Google servers do and some don't.
I have been working on the duplicate content side of things. I've added a robots.txt to guide Google away from pages that are useless to index. I've also added a page-topical news feed to help distinguish each page as an individual.
Today, Google reports 1.2 million pages and now shows 200 results (209 with omitted results)
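For anyone wanting to double-check a robots.txt before relying on it, a rough sketch using Python's standard urllib.robotparser follows. The rules and paths here are made up for illustration (they are not the rules from this site), and the standard-library parser only does simple prefix matching, so it may not agree with Googlebot on wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the paths are illustrative, not taken from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths: one real content page and two low-value variants.
paths = [
    "/widgets/us/texas/austin",
    "/search?q=widgets",
    "/print/widgets/us",
]

for path in paths:
    url = "http://example.com" + path
    allowed = parser.can_fetch("Googlebot", url)
    print("crawl" if allowed else "skip ", path)
```

Running it against a sample of real URLs from the logs is a quick way to confirm the content pages are still crawlable and only the useless ones are blocked.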
Am I moving in the right direction?