Forum Moderators: Robert Charlton & goodroi
It's strange because my traffic is only a little lower, maybe 10%.
I know my site has more than 1190 pages, and it's dynamically built, and I double-checked that each page has a unique title, meta description, and content, so there is no problem there.
Google crawls my homepage about every 12 hours.
Should I be worried?
Thanks guys
Could this be related? Let's keep this discussion to those of us who have noticed their site: results fall over the last few days and weeks.
I was having a lot of problems with duplicate content on the site. There were several ways of getting to the same page (different URLs), and as we know, this can be a bad thing. The site has a forum that has generated 16,000 topics (some of them spanning multiple pages), so in essence I am going to estimate that I have around 19,000 pages total on the site. At the height of the duplicate problem, when I did a site:mysite.com, I was getting over 80,000 pages returned. WOW!

I fixed all the dupe content issues, and now each page has one URL and a uniquely generated title, description, and keywords - and of course the content is different, since it is user generated. I used robots.txt to get rid of the duplicated pages and started to watch what would happen. This seemed to have corrected the problem: pages started going supplemental and dropped, as far as I can tell.

But the pendulum seems to have swung too far! Within the past month, the number of pages returned using site: has been slowly dropping. Now when I do a site:mysite.com, it only shows 4,000 pages. Huh? What's the deal with that? Not only that, when I do a site:mysite.com/*, I only get about 800 pages.

So I am confused, of course. But are the missing pages really not there? I conducted about 200 searches for the pages that I thought were missing and found every single one of them, though the searches were fairly specific. So what does this tell me? The site: operator does not work. All of my pages are there; it's just that Google doesn't want to count them all with this operator. What does this mean? Not sure, but it is what it is. Every page I find missing, I can find in a search. The tool seems to be broken - like a lot of the tools on G.
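For reference, a minimal robots.txt sketch of that kind of fix. The Disallow patterns here are hypothetical - the poster doesn't say which duplicate URL routes the forum generated - so substitute whatever alternate paths actually lead to the same topic pages:

```
# Hypothetical example: block the duplicate routes to each topic,
# leaving only the one canonical URL per page crawlable.
User-agent: *
Disallow: /forum/print/
Disallow: /forum/topic.php?sort=
```

Note that blocking a URL in robots.txt stops it from being crawled, but already-indexed copies can linger in the index for a while, which fits the slow drop-off described above.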
I was also told in another thread that AOL has a more accurate count of pages in G's index, since Google supplies their results.
My best guess is that this is simply a change in how Google reports site: figures. Our page counts have been dropping by a couple of thousand pages per week for a month and continue to drop.
All rankings are stable, with some increases actually. Ten-year-old site, PR 7, around 1 million backlinks.
My site is organized alphabetically a-z, so I did a check on site:www.example.com/z/ for the letter "z". I got only 126 results. It would normally be a few THOUSAND.
Then I started checking subdirectories under /z/, like site:www.example.com/z/subDir1, site:www.example.com/z/subDir2/..., and was pleasantly surprised to see all my pages in there. So the actual number of pages indexed is much higher than 126. Basically it says "126" but actually has a couple thousand. This is strange - I hope they fix this, as it freaks me out.
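A quick way to repeat that check across the whole a-z structure is to generate one site: query per letter and run them by hand. This is just a sketch using the placeholder domain from the post above; swap in your own domain and directory pattern:

```python
# Generate a site: query for each alphabetical section of the site,
# so per-directory index counts can be compared against the top-level
# site: total one query at a time.
import string

domain = "www.example.com"  # placeholder domain from the post above

queries = ["site:%s/%s/" % (domain, letter)
           for letter in string.ascii_lowercase]

for q in queries:
    print(q)
```

Summing the per-directory counts and comparing against the bare site:domain count is what exposed the discrepancy described above.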
[edited by: tedster at 6:42 am (utc) on Nov. 16, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]
I agree - there doesn't seem to be the kind of drop in traffic you would expect with this many pages missing from the index.
I'm also noticing that when clicking the cache link on pages in the index, many of them don't return a cached version. Is this typical? It seems to be quite a lot of them.
One other observation: we provide template-type content pages for a number of independently owned webstores that use the same folder and file formats attached to different domains. These pages have always been pretty steady in the index for each domain, although they don't rank very well except on the sites that have built some inbound links, etc.
The crazy thing is finding these pages in the index for one of the sites through a search like site:www.sitenumber1.com/folder1, then clicking on the cache link and seeing that the cached version of some of the pages is showing from www.sitenumber2.com. The cache link URL and the version the link lands on show different domains.
Some pages that are too similar get filtered out based on duplicate content within a site, but I've never seen this type of thing before, and I wonder if this is just part of the glitch, or if we have to make sure that no two sites use the same /folder1/file1.htm addresses in template-driven pages. Any thoughts on this appreciated.
Google is clearly changing something in the way they store data. That change makes calculating the site: numbers (which are only ever estimates) a problem. We've gone through cycles like this before.